INTRODUCTION
In concluding a popular exposition of probability in quantum mechanics 
(QM), Richard Feynman says ‘But the deep mystery is what I have de-
scribed, and no one can go any further today.’¹ Thus Feynman poses a
challenge by sanctioning an attitude common among scientists, the 
attitude that quantum physics is mysterious. Our probabilistic notions, 
our logic and the most basic guidelines of common sense seem to go awry 
when set in the quantum domain. I want to take up Feynman’s challenge 
here by showing that this is not the case; that is, I want to argue that 
things are perfectly all right in QM. This is, however, a complex argument 
to make in general. And so I shall attempt to make it in two stages. In the 
first stage, which comprises Sections 1 to 4, I shall discuss a particular
case in a way that will, I hope, allow for generalisation. 

* Portions of this work were read at The University of Indiana, at Cornell University, at
The Boston Colloquium for the Philosophy of Science and at the Rockefeller
University. I want to thank the discussants on these occasions. A special word of thanks is
due Abner Shimony for emphasising to me the objection raised here at the end of
Section 6 and dealt with in the next two sections. I want also to thank the referees,

1 Feynman [1965], p. 145.

Precisely because it has been so often discussed, the case I shall look at
is that of the two-slit experiment. I can use this case to point out the
tension that Feynman and others see between probability and logic and 
common sense. And I can point out in just this context what seems to me 
to be the obvious way to resolve the tension. The way has to do with how 
QM uses probability. The important feature of the use of probability in 
QM is this. In QM attributions of probability are not derived from an 
underlying phase space. This contrasts, of course, with the situation in 
statistical mechanics. I shall argue, however, that precisely this feature is a 
mark of the classical nature of probability in QM. The principle behind 
this argument is that one and the same concept can have several legitimate 
uses. Thus I contend that although there are differences in usage between
‘probability’ in statistical mechanics and ‘probability’ in QM, the concept 
of probability is the same classical concept in both cases. I shall make the 
argument of this first stage in two pieces. First I shall suggest that prob- 
ability derives from a root notion of chance and that the logic of this root 
notion sanctions precisely the kind of use one finds in QM. Secondly, I 
shall sketch the probabilistic setting of QM and point out that this setting 
is merely an instance of a general framework for statistical theories; that is, 
for theories that employ probability. This framework applies to QM, to 
statistical mechanics as well as to the most diverse scientific cases—and the 
framework uses nothing but the classical concept of probability. 

The result of this first stage is to dissolve the tension and mystery that 
surrounds the quantum theoretic account of the two-slit experiment by 
presenting a version of that account which employs ordinary logic and the 
classical concept of probability to implement a common sense view of how 
the experiment goes. In the second stage of my argument, which comprises 
sections 5 through 10, I extend this harmonious picture of the two-slit
experiment to the general formalism of elementary quantum theory. 

This extension involves developing a classical propositional language 
for QM, laying down conditions for the admissible valuations of this 
language and then actually providing two different families of classical, 
bivalent valuations that are admissible. This work makes use of a feature 
derived from the picture developed in stage one. This feature would allow 
a quantity to have a precise value even if the state of the system is not the 
corresponding eigenstate. Breaking the eigenvalue-eigenstate link in this 
way is presupposed by the common sense account of the two-slit experi- 
ment. Using this feature enables one to carry on the work of stage two 
without falling prey to the no-hidden-variable proofs. The work of this
stage does, however, fall under the proof that ‘joint distributions’ exist
just in the case of commuting operators. In section 10 this result is
demonstrated for the logical framework of this second stage in order to show that
although this framework does generalise the picture of stage one, it does
not go too far. For I show that the probabilities derivable from this frame- 
work are precisely those assigned by QM; nothing less, but also nothing 
more. 

One can view this second stage as a necessary technical underpinning 
for the intuitive picture of QM presented in stage one. It is, in effect, a 
demonstration that this intuitive picture is consistent with ordinary QM as
a whole. It is, however, also something more. For the apparatus developed
in stage two can be brought to bear on the so-called ‘paradoxes’ of QM
(Einstein-Podolsky-Rosen and Schrödinger’s cat) and, indeed, on the
measurement problem in general. (See Fine [1973].) Thus the harmonious
picture developed in stage one and defended in stage two provides a
setting in which the outstanding interpretive problems of QM find a
resolution that is intuitively acceptable as well as formally correct.


1 THE EXPERIMENT AND THE ARGUMENT¹


Let us suppose we are conducting an electron experiment so that a filament 
acts as a source of electrons, spewing them out with a common energy 
towards a tungsten plate. On this plate, equidistant from the source, are 
two holes, call them A and B, that are decently separated, and behind the
plate—at a sufficient distance—is a sensitised screen. The result of the 
experiment will be a certain pattern of electron hits on the detecting 
screen. This is the so-called interference pattern. If we block up hole A
in the tungsten plate, leaving hole B open, then a different pattern will 
form on the detecting screen. Call it the B-pattern. Similarly if we block B 
we shall get a characteristic A-pattern. If we superimpose the A and B 
patterns then we get something that it is appropriate to call the additive 
pattern, for the number of electron hits in a given region of the additive 
pattern is simply the sum of the numbers of hits from the A and B patterns. 
A comparison of the additive and interference patterns will show that they 
are substantially different. For example, there are places where the
interference pattern shows a light patch (i.e. few electron hits) but where the
additive pattern shows a dark patch of many hits. The pattern of
distribution of electrons that emerges from a given experiment gives rise to a
probability distribution for the arrival of electrons at the various locations 
on the detecting screen. If X is a region on the screen, then the probability 
for arrival at X should be proportional to the number of hits in region X.

1 The discussion in this section follows a similar discussion in my [1972]. The reader is
referred to that article for the technical underpinnings as well as hedges to this account
of the two-slit experiment.

Consider now the following simple account of the two-hole experiment:
each electron leaves the filament, passes the barrier set up by the tungsten
plate either by going through hole A or by going through hole B, but not 
both, and then arrives to be detected on the screen. Suppose we treat this 
account as a hypothesis—I shall often refer to it as the hypothesis of 
mutually exclusive passage—and suppose we try to test it by means of the 
following argument. 
If the account is correct then every electron goes through A or through
B, but not through both. In symbols,

$$(A \lor B) \land {\sim}(A \land B). \tag{Hyp}$$

Thus the probability Pr(X) that an electron arrives at location X on the
detecting screen is the probability that the electron passes through A or
through B and arrives at X.

$$\Pr(X) = \Pr([A \lor B] \land X). \tag{1}$$

By the distributive law this is the probability that either the electron
passes through A and arrives at X or passes through B and arrives at X.

$$\Pr(X) = \Pr([A \land X] \lor [B \land X]). \tag{2}$$

By the law of total probability,

$$\Pr(X) = \Pr(A \land X) + \Pr(B \land X) - \Pr(A \land X \land B \land X). \tag{3}$$

By hypothesis, $\Pr(A \land B) = 0$. Hence,

$$\Pr(A \land X \land B \land X) = 0. \tag{4}$$

So,

$$\Pr(X) = \Pr(A \land X) + \Pr(B \land X). \tag{5}$$

Thus the probability for arrival at X is the sum of two terms, each
term being the probability for passing through one hole and arriving at X.
These probabilities, however, can be obtained from the single-slit
experiments; namely they are the probabilities corresponding to the A-pattern
and the B-pattern weighted, respectively, by the probability for passage
through A and through B, i.e.

$$\Pr(A \land X) = \Pr(X/A) \cdot \Pr(A) \tag{6a}$$

and

$$\Pr(B \land X) = \Pr(X/B) \cdot \Pr(B). \tag{6b}$$

Thus,

$$\Pr(X) = \Pr(X/A) \cdot \Pr(A) + \Pr(X/B) \cdot \Pr(B). \tag{7}$$

Since the holes A and B are equidistant from the electron source, we may
assume that Pr(A) = Pr(B). (This assumption is not essential to the
argument, but it does make the arithmetic simpler.) This yields

$$\Pr(X) \propto \Pr(X/A) + \Pr(X/B). \tag{8}$$

Therefore the probability for arrival at X is just the probability
corresponding to the additive pattern. If our hypothetical account were correct,
then, we should find that the two-hole experiment gives rise to the additive 
pattern on the detecting screen. In fact, we find the interference pattern, 
and thus we must reject this account. 
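
The arithmetic behind equations (7) and (8) can be made concrete with a small sketch. The hit counts below are invented purely for illustration, and the names `additive_distribution`, `A_PATTERN` and `B_PATTERN` are hypothetical; nothing here is measured data. On the assumption that Pr(A) = Pr(B) = 1/2, the function weights the relative frequencies of the two single-hole patterns and adds them, which is just what the argument calls the additive pattern.

```python
from fractions import Fraction


def additive_distribution(pattern_a, pattern_b, pr_a=Fraction(1, 2)):
    """Equation (7), region by region: Pr(X) = Pr(X/A) Pr(A) + Pr(X/B) Pr(B).

    pattern_a and pattern_b map screen regions to hit counts from the
    A-hole and B-hole experiments respectively.
    """
    pr_b = 1 - pr_a
    total_a = sum(pattern_a.values())
    total_b = sum(pattern_b.values())
    regions = sorted(set(pattern_a) | set(pattern_b))
    return {
        x: Fraction(pattern_a.get(x, 0), total_a) * pr_a
        + Fraction(pattern_b.get(x, 0), total_b) * pr_b
        for x in regions
    }


# Hypothetical hit counts per screen region (illustrative numbers only).
A_PATTERN = {"X1": 60, "X2": 30, "X3": 10}
B_PATTERN = {"X1": 10, "X2": 30, "X3": 60}

additive = additive_distribution(A_PATTERN, B_PATTERN)
for region, probability in additive.items():
    print(region, probability)
```

If the distribution observed with both holes open differs from `additive` in some region, then, on the argument just given, one of the premises leading to equation (7) has to be given up.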

Unfortunately, however, the situation is not clear cut. We can attempt 
to check the hypothesis of mutually exclusive passage in a direct manner. 
We can, for example, place counters—perhaps light sources and photo- 
multipliers—around each hole that will register the passage of an electron. 
Then we can count the number of electrons that pass through the holes. If 
we do so, the numbers will tally and we can conclude that each electron 
went through one hole or the other, and none went through both holes. 
Oddly enough in this two-hole experiment with counters, where we check 
the passage of the electrons through the holes, it is the additive pattern and 
not the interference pattern that builds up on the detecting screen. 

I think that there is some inclination to accept the two-hole experiment 
with counters as providing direct experimental evidence for the mutually 
exclusive passage of electrons through the holes. Perhaps the scientific 
ideal of objectivity contributes to this inclination. For the alternative 
seems to be this: one says that when the electrons are observed then they, 
exclusively, pass either through one hole or the other; whereas if they are 
not observed then they do not. If observation is to be an objective guide to 
reliable information, however, then what we observe must correspond to 
how things are, either simultaneous with or just prior to our observations. 
Thus just prior to our observation of an electron at the outlet of hole A, 
the electron must have been passing through hole A. This is, of course, 
compatible with our observation of the electrons disturbing them in such a 
manner that subsequently the additive and not the interference pattern is 
formed. For the moment suppose we are thus inclined to accept the 
hypothesis of the mutually exclusive passage of the electrons through the 
holes. Then, somehow, we must fault the argument that leads from this
hypothesis to the rejection of the interference pattern; for I assume that 
we accept the occurrence of that pattern as well. In expounding that 
argument I have tried to bring out the features that one might fault. There 
seem to be just three of them: (i) the identification of $\Pr(A \land X)$,
$\Pr(B \land X)$ with the A-pattern, B-pattern probabilities given by equations
(6); (ii) the use of the law of total probability, that is, the formula

$$\Pr(U \lor V) = \Pr(U) + \Pr(V) - \Pr(U \land V),$$

to move from equation (2) to equation (3); and (iii) the use of the
distributive law, that is, the law asserting the equivalence of

$$(\phi_1 \lor \phi_2) \land \phi_3$$

with

$$(\phi_1 \land \phi_3) \lor (\phi_2 \land \phi_3)$$

to move from equation (1) to equation (2).

The first feature (i) involves an obvious use of conditional probability. 
For if the probabilities are well defined, then the probability that an 
electron passes through A and arrives at X is just the conditional prob- 
ability for arrival at X given that the electron has passed through A, 
multiplied by the probability for passage through A, i.e. equation (6a). 
One cannot fault this formula since it employs nothing more than the 
definition of conditional probability. The perhaps questionable move, 
then, must be the identification of the probability derived from the A- 
pattern—that is, the probability for arrival at X in an experiment with 
just hole A open—with the conditional probability for arrival at X given 
that the electron has passed through A, Pr(X/A). There is, it seems to me, 
some room here for doubt and, if I understand him correctly, just this 
point has been made by Professor Bernard Koopman.¹ Koopman em-
phasises that what I have called the probability for arrival at X in the
A-pattern and the probability for arrival at X in the interference pattern 
are not the probabilities of two outcomes of one and the same experiment 
(namely, the two-slit experiment) but are rather the probabilities for the 
same outcome (arrival at X) in two different experiments (an A-hole and a 
two-hole experiment). Thus Koopman, if I read him fairly, would point 
out that the conditional probability for arrival at X, given that hole A has 
been traversed, Pr(X/A), is a probability defined with reference to the 
two-hole experiment and that there is no reason a priori to identify this 
probability with the probable outcome of an entirely different, A-hole 
experimental arrangement. There is, of course, no reason a posteriori 
either, since such identification conflicts with the observed interference 
pattern. 

1 Koopman [1957].

The emphasis placed here on an analysis in terms of experimental
conditions follows the procedure that was always advised by Niels Bohr. It
is surely sound advice, in general, but in the present context it is not at all
conclusive. Recall that we are operating under the assumption that in the
two-hole experiment each electron goes through exactly one hole. We can
now ask whether the state of an electron that has just passed through hole
A in the two-hole experiment would be any different from the state of an
electron that has just gone through hole A in the A-hole experiment. This
is a question phrased in the ‘state’ language of quantum theory and it is to
that theory that we must look for an answer. According to the way the
theory is usually employed, to say that the electron has just passed through
hole A implies that repeated attempts to locate the electron would always
find it in some small region R around the hole. It appears then that the
state of the electron would be given by a psi-function whose representation,
in ordinary 3-space, would be localised in region R. It thus seems that the
states after passage through hole A would be the same in both single and
double hole experiments. Since identical states will give rise to identical 
probabilities for arrival at X, it is a consequence of this quantum theoretic 
analysis together with the assumption that the electron has passed through 
one hole or another that the conditional probabilities for the two-hole 
experiment can be identified with the probabilities derived from the A and 
B hole patterns.¹ If this defence of the ascription of probabilities that lead
to the additive pattern is cogent, then we must look to other features of the 
argument if we are to save the hypothesis of exclusive passage. 

The second feature (ii) was the law of total probability. Given the
assumption that each electron goes through one hole or another, there can
be no question concerning the applicability of the law. Thus criticism 
must be of the very law itself. Just here, indeed, we find that commentators 
like Professor Feynman have suggested that the probabilistic calculus 
employed by QM deviates from the classical one by including violations of 
the law of total probability. That law, however, is deeply entrenched in the 
classical notion of probability, for it secures the additivity of probability 
over disjoint classes. To abandon the law of total probability is to admit the
non-classical nature of probability in QM. If we are reluctant to do this, 
however, then there appears to be only one last feature of the argument on 
which to fall back. 

This last feature is the distributive law of ordinary propositional logic. 
This law is used to move from the conjunction

$$(A \lor B) \land X$$

to the disjunction

$$(A \land X) \lor (B \land X).$$
It is the law of total probability applied to this disjunction that leads to the 
additive pattern probabilities. Hence we could avoid the additive pattern 
and save the hypothesis of mutually exclusive passage by denying this 
instance of the distributive law of logic. 


1 I argue here that if passing through hole A implies that the state is given by a delta
function localised around hole A and if Pr(X/A) is well-defined in the two-hole
experiment, then Pr(X/A) is the same as the probability for arrival at X in the A-hole
experiment. M. Gardner seems to admit both parts of the antecedent here (see his [1971],
Section 2 and Section 4) but to deny the conclusion (Section 5). Gardner’s way with the
two-slit experiment is thus inconsistent. My own way, as will emerge below, is to deny
both parts of the antecedent.

If common sense dictates that for an electron to get from the source to
the detecting screen it just must go through one hole or another, then we
see here the tension between common sense, common logic and ordinary 
probability. It seems to me that there is an obvious way to resolve the 
tension. Before pointing it out, however, it will be helpful to have an 
overview of how QM treats the two-hole experiment. 


2 SAVING THE HYPOTHESIS: PROBABILITY 


As I understand how quantum theory views the two-hole experiment, 
there are two packages of information that the theory makes accessible. 
There are, first, the probabilities that the electron goes through one hole or
the other. In the language of classical probability theory, we have a two 
element sample space—I shall call it the barrier space—consisting of the 
event ‘the electron goes through hole A’ and the event ‘the electron goes 
through hole B’. We have assumed that these events are equally likely, and 
thus they each have probability 1/2. The second package of information 
concerns the probability for arrival at region X on the detecting screen. 
We can codify this information in another two element sample space—call 
it the receiver space—consisting of the event ‘the electron arrives at X’ and 
the event ‘the electron does not arrive at X’. The probabilities here, as for 
the barrier space, are determined by the state function that quantum 
theory associates with the electron. We have now set up two finite prob- 
ability spaces and within each space, separately, the calculus of probability 
can be used to work out the probability for various compound events— 
conjunctions, disjunctions, etc. Suppose I now form the compound event 
‘the electron goes through hole A and the electron arrives at X’ and suppose 
I ask for its probability. You will notice that this event is a properly formed 
and intelligible conjunction of the ‘hole A’ event from the barrier space 
and the ‘arrives at X’ event from the receiver space. It is, however, an 
event that occurs in neither space and, therefore, there is no way to use the
probabilities already assigned, together with the calculus of probabilities, 
in order to determine the probability of this compound event. One must be 
very clear on this point: the calculus of probabilities enables one to trans- 
form, within a given space, probabilities that have already been assigned. 
That calculus does not assign probabilities. It satisfies, rather, a very strict
conservation principle: it neither destroys probabilities nor does it create 
them. 

If we want the probability that the electron goes through hole A and
arrives at X we shall have to set this event in an appropriate sample space. 
The space we require is just the four element product space of the barrier 
and receiver spaces. Its events are the pairs 
(A, X): ‘the electron goes through A and arrives at X’
(B, X): ‘the electron goes through B and arrives at X’ 
(A, ~ X): ‘the electron goes through A and does not arrive at X’ 
(B, ~ X): ‘the electron goes through B and does not arrive at X’. 


To assign probabilities in the product space the four marginal probability 
conditions must be met. 


$$\Pr(X) = \Pr(A, X) + \Pr(B, X). \tag{i}$$

$$\Pr({\sim}X) = \Pr(A, {\sim}X) + \Pr(B, {\sim}X). \tag{ii}$$

$$\Pr(A) = \Pr(A, X) + \Pr(A, {\sim}X). \tag{iii}$$

$$\Pr(B) = \Pr(B, X) + \Pr(B, {\sim}X). \tag{iv}$$


We can look upon these conditions as a system of four equations in the 
four unknowns that are the joint probabilities on the product space. 
These are linear equations and elementary calculations show that they have 
no unique solution, indeed that there are infinitely many distinct solutions. 
Thus as we have just emphasised, the calculus of probabilities is not 
endowed with the resources for extending the probabilities from the 
barrier and receiver spaces to determine a probability function for the 
product space. Therefore if there is going to be a probability function on 
the product space we must look for it outside the domain of the theory of
probability. But we should search in vain.¹

For if there were any joint probability function defined on the product of 
the barrier and receiver spaces, then the first of the marginal probability 
conditions, (i), would yield that the probability for an electron to arrive at 
X is the sum of the probability that the electron passes through A and 
arrives at X plus the probability that the electron passes through B and
arrives at X. In that case, however, if one grants the analysis of section 1, 
the probability for arrival at X would correspond to the additive pattern 
and not, as quantum theory actually assigns it, to the interference pattern. 
Thus if we pay attention to how quantum theory handles the two-hole 
experiment we can detect a fallacy in the argument from the hypothesis of 
mutually exclusive passage to the additive pattern. It is the unstated 
assumption that the probability is well defined for the compound events of 
passing through A and arriving at X, passing through B and arriving at 
X, etc. (Or, what amounts to the same thing, that the conditional prob- 
ability Pr(X/A) is well-defined.) There is, however, no reason a priori for 
these probabilities to be defined at all, and in quantum theory they are not.
I shall argue that this situation reflects no departure from ordinary
probability theory; moreover, since it is the probabilities that lack definition and
not the conjunctions of events, there is no unorthodoxy with regard to the
logic of events either. The situation does, of course, reflect (in the sense
that it results from) those special features of the microcosm that are built
into quantum physics.

1 The point is that the so-called ‘joint distributions’ are the exception rather than the rule
in quantum mechanics. For a demonstration of this together with a careful formulation
of when the joint distributions do exist see Nelson [1967], Section 14, theorem 14.1,
p. 117. See also Cohen [1966].
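
To return for a moment to the four marginal probability conditions of this section: the claim that they leave the joint probabilities undetermined can be checked by direct calculation. The sketch below is only an illustration under stated assumptions. The value chosen for Pr(X) is hypothetical, as are the names `joint_candidates` and `satisfies_marginals`, and the parameter `t` simply stands for a freely chosen value of Pr(A, X).

```python
from fractions import Fraction


def joint_candidates(pr_x, pr_a=Fraction(1, 2), steps=4):
    """Return several joint assignments that all satisfy the marginal conditions."""
    pr_b = 1 - pr_a
    lo = max(Fraction(0), pr_x - pr_b)  # keeps Pr(B, ~X) non-negative
    hi = min(pr_x, pr_a)                # keeps Pr(B, X) and Pr(A, ~X) non-negative
    candidates = []
    for k in range(steps + 1):
        t = lo + (hi - lo) * Fraction(k, steps)  # free parameter: t = Pr(A, X)
        candidates.append({
            ("A", "X"): t,
            ("B", "X"): pr_x - t,
            ("A", "~X"): pr_a - t,
            ("B", "~X"): pr_b - (pr_x - t),
        })
    return candidates


def satisfies_marginals(joint, pr_x, pr_a=Fraction(1, 2)):
    """Check the four marginal conditions and non-negativity."""
    pr_b = 1 - pr_a
    return (
        joint[("A", "X")] + joint[("B", "X")] == pr_x
        and joint[("A", "~X")] + joint[("B", "~X")] == 1 - pr_x
        and joint[("A", "X")] + joint[("A", "~X")] == pr_a
        and joint[("B", "X")] + joint[("B", "~X")] == pr_b
        and all(p >= 0 for p in joint.values())
    )


PR_X = Fraction(3, 10)  # hypothetical value for Pr(X)
candidates = joint_candidates(PR_X)
print(all(satisfies_marginals(j, PR_X) for j in candidates))
```

Each candidate differs from the others in the value it gives to Pr(A, X), yet all of them meet the same marginal conditions. The marginals therefore constrain a joint function without fixing one, which is the sense in which the calculus of probabilities cannot supply the joint probabilities by itself.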

My suggestion, then, is that we can stick with common sense in affirm- 
ing the hypothesis of mutually exclusive passage, we can retain the dis- 
tributive law of ordinary logic and indeed all of ordinary logic itself, we 
can hold onto the classical concept of probability, and we can relieve the 
tension felt among these elements provided we admit that certain compound 
events are outside the domain of the probability functions of QM. This is
a suggestion that works, but the question remains as to whether it is not
itself too drastic. Does not the fact that QM excludes certain well-defined
events from the domain of probability show that the use of probability in 
QM is somehow suspect? I want to argue, no, that the ordinary concept of 
probability contemplates just such a use. 

There is a certain amount of evidence ready at hand which points this 
way. There is the fact, already illustrated, that the marginal probability 
conditions are insufficient to determine the probabilities on the joint 
space. Thus the existence of the compound probabilities cannot be de- 
rived from the probabilities already assigned in the factor spaces together 
with the theory of probability. In limiting relative frequency models of 
probability there are examples of events (i.e. zero-one sequences) A, B 
each of which has a probability but for which the compound event $A \land B$
has no probability.¹ In the more primitive framework of comparative
probability there are spaces each of which, separately, admits a probability
measure compatible with the comparative order, but for which the product
space admits of no such measure.² And, finally, there are plausible
examples where each of a pair of events may be assigned a probability but
where we should be reluctant to assign probability to the compound event.
This might occur, for instance, where the bases for assigning the individual
probabilities cannot be reasonably combined.³ All this evidence points to
situations in which the ascription of probability to a well-defined and pos- 
sibly occurring event is inappropriate. I should like to develop this idea 
further. 


1 See Mehlberg [1968].

2 See Kraft, Pratt and Seidenberg [1959]. A general existence theorem is contained in
Section 4 of Scott [1964]. (I owe these references to Terry Fine, whom I also want to
thank for useful discussions of all this.)

3 See Hájek and Šidák [1967].
3 CHANCE


The idea is that the assignment of probability in a given situation is 
dependent on features of the situation, in the sense that were these features 
not present then no question of assigning probability would arise. I shall 
refer to these features as chance features and to a situation that has them as 
a situation of chance. Probability is generally assigned to events. But in
most scientific contexts, as well as elsewhere, one can take an event as
what happens when an object comes to have (or to ‘display’) a property.
Thus we speak of the probability for an atom to decay, to gain momentum,
to arrive at a certain location, etc. In each case I would say that the
ascription of probability presupposes that the atom is in a situation of chance for
the display of the property in question. Similar chance situations lie 
behind ascribing probability to the winning of a race by a runner or to the 
display of a certain number by tossing a die. 

If I am correct in thinking that a primitive notion of chance lies behind 
our use of ‘probability’, then it may be useful to try to spell out this 
primitive notion in order to identify those features on which probability 
depends. Roughly, I would say that a situation is one of chance for the
display of a certain property just in case it is possible both to display the
property in that situation as well as not to display the property. That is,
$\phi$ is a situation of chance for the display of a property P iff it is possible
that there exist objects O, O' both of which are in the situation $\phi$ but such
that when O is tested for the display of P it displays P whereas when O' is so
tested it fails to display P. And the connection between chance and
probability is just this. If $\phi$ is not a situation of chance for the display of a
property P then one cannot correctly ascribe to any object in the situation
$\phi$ any probability for P, other than 0 or 1. If in the given situation it
would be incorrect to ascribe a probability of either 0 or 1, then no
probability can be ascribed. Thus in a perfectly standard way probability
presupposes chance. 
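
The definition just given can be put in a rough computational form. This is only a sketch and it rests on a simplifying assumption of my own: a situation is modelled as a finite list of the results that a test could possibly yield, which is much cruder than the modal wording above. The names `is_chance_situation`, `DIE_TOSS`, `shows_six` and `shows_at_most_six` are hypothetical.

```python
def is_chance_situation(possible_results, displays_property):
    """True iff the property is displayed in some possible test result of the
    situation and fails to be displayed in some other."""
    outcomes = [bool(displays_property(result)) for result in possible_results]
    return any(outcomes) and not all(outcomes)


def nontrivial_probability_excluded(possible_results, displays_property):
    """Outside a situation of chance, only probability 0 or 1 can be ascribed."""
    return not is_chance_situation(possible_results, displays_property)


# Hypothetical situation: one toss of an ordinary six-sided die.
DIE_TOSS = [1, 2, 3, 4, 5, 6]


def shows_six(face):
    return face == 6


def shows_at_most_six(face):
    return face <= 6


print(is_chance_situation(DIE_TOSS, shows_six))          # chance: six may or may not show
print(is_chance_situation(DIE_TOSS, shows_at_most_six))  # no chance: always displayed
```

Note that the second function states only a necessary condition. Being a situation of chance is required for ascribing a probability other than 0 or 1, but nothing in the sketch says which value, if any, should be ascribed.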

It may seem that this way with ‘chance’ is too crude, for it captures only 
the surface feature as to whether the property is or is not displayed under 
certain tests. But the analysis even at this level is fine enough for us to pose 
and to answer the question that bears on the use of probability for compound 
events. Suppose ¢ is a situation of chance for the display of P and is also a 
chance situation for Q. The question is whether it follows that ¢ is likewise 
* a situation of chance for the display of the conjunctive property (P A Q).3 
1] do not intend any commitments here to ‘conjunctive’ properties, as opposed to 

‘ordinary’ ones. Nor do I intend to imply what is clearly false; namely, that if each of the 


predicates P and Q denotes a property then. so does their conjunction (e.g., take the 
conjunction of ‘is a liquid’ with ‘is scratchable’). Thus any thesis that commits one in 


. 
e 
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What is required is that there may be a pair of objects each in situation ¢ 
but such that one of them displays (P ^ Q) after an appropriate test 
whereas the other one does not. This required condition, however, does 
not follow from the assumed separate chances for P and for Q. For these 
separate chances may correspond to cases where P but not Q is one test 
result and Q but not P is another. The realisation of these possibilities, 
however, would not yield the joint occurrence of P and Q, and hence 
would not support that ¢ is a chance situation for (P ^ Q). 

The point of the calculation is this. If probability presupposes chance 
and if chance need not attach to conjunctions whenever it attaches to the 
conjuncts, then neither does probability. (This assumes, of course, that in 
some of these situations where the conjunction has neither probability 
o nor 1 probability does attach to the conjuncts.) Thus when we view 
probability, so to speak, from its roots we come to expect gaps. We come, 
that is, to expect that there will be uses of probability that leave certain 
well-defined events out of the domain of the probability assignments. This 
is my first line of approach to the situation in QM. My second line is to try 
viewing probability from its flowering heights. 


4 STATISTICAL THEORIES 


The heights of probability occur when that notion is introduced in the 
context set by an already flourishing scientific theory. The events to which 
probability are assigned here can generally be cast in the form of events 
which occur when an object in a state specified by the theory takes on a 
certain value of a quantity of the theory. For example, the event which 
occurs when an excited atom decays, or when the energy of a system of 
particles reaches a certain value. We can look at the theory itself in the 
following way. The theory treats a certain domain of objects; it specifies 
admissible states for these objects as well as possible quantities. Each 
object in a specified state then takes on a certain value for a given quantity. 
Within such a context there is a special way of introducing probability 
which seems to function for some as a paradigm for how to do it. I want to 
examine this way, for the belief that it is a paradigm seems to have misled 
people about what are necessary features in the use of probability. The way 
is modelled on statistical mechanics. One takes the entire set of states 
specified by the theory as a phase space. One makes a classical probability 
space out of this by introducing an appropriate σ-algebra of subsets of the 

general to the probability of compounds is surely false. The argument here seeks to 
waive these questions about the existence of compounds and shows that even when this 
existence is not itself problematic there are still restrictions on how ‘probability’ may be 
deployed. 

phase space and by defining for each state φ a probability measure $P^\phi$ on 
this σ-algebra. Each quantity Q of the theory can then be construed as a 
random variable on this probability space, with a distribution $P_Q^\phi$ obtained 
from the probability measure $P^\phi$ according to the usual formula: 


$$P_Q^\phi[A] = P^\phi[Q^{-1}(A)].$$

The effect of this way of introducing probability is to get for each quantity 
Q and for each state φ a probability measure $P_Q^\phi$ on the Borel subsets of 
the real numbers. Intuitively, what one gets is the probability for each 
state φ that the quantity Q is confined to A, for any Borel set A. One also 
gets, however, a side effect. For since, in each state, the quantities are 
random variables over a common space it follows that there is a joint 
distribution for each pair of quantities. Thus this way of introducing 
probability forces out probabilities for all possible compound events. 
Small wonder that the attempt to treat probability in QM as though it had 
been introduced in this way has failed. For we know that in QM not all 
compound events are assigned a probability. This failure, however, does 
not mark any deviation from the conceptual basis of classical probability 
theory. It marks, merely, too restrictive a setting for the ordinary 
probabilities of QM.¹ 

There is a more general setting.² Simply introduce for each state φ and 
quantity Q a probability measure $P_Q^\phi$ on the Borel sets which bears the 
intuitive reading previously introduced. It may happen that certain of the 
quantities are interpretable as compounds of other quantities. In such 
cases this way of using probability will assign probability to the corre- 
sponding compound events. When this happens, of course, it will be a 
result of the theory. It will not follow from some extra-theoretical pre- 
scription for how to use probability; for example, the prescription that 
quantities be construed as random variables over a common phase space. 

One can readily see that QM follows this general setting. In QM the 
measures on the Borel sets are introduced directly by means of Born’s Rule 

$$P_Q^\phi[A] = (q(A)\phi, \phi),$$

where q is the spectral measure associated with Q and ( , ) is the inner 
product on the appropriate Hilbert space. Joint probabilities exist (roughly) 
just in case the corresponding operators commute, in which case their 
product is well defined and the compound probabilities follow from the 
probabilities assigned to the product as above (see footnote 1, p. 9). 
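
Born's Rule for a quantity with a discrete spectrum can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names (`inner`, `project`, `born_probability`), the two-dimensional example quantity and the sample state are hypothetical and do not come from the text, and the Borel set A is represented by a predicate on eigenvalues.

```python
import math

def inner(u, v):
    """Inner product (u, v), conjugate-linear in the second argument."""
    return sum(a * b.conjugate() for a, b in zip(u, v))

def project(vec, unit):
    """Project vec onto the one-dimensional subspace spanned by a unit vector."""
    c = inner(vec, unit)
    return [c * x for x in unit]

def born_probability(eigenpairs, state, in_a):
    """P_Q^phi[A] = (q(A) phi, phi) for a discrete spectrum.

    eigenpairs: (eigenvalue, unit eigenvector) pairs forming an orthonormal basis.
    in_a: predicate on real numbers standing in for the Borel set A.
    """
    q_a_phi = [0j] * len(state)
    for eigenvalue, vec in eigenpairs:
        if in_a(eigenvalue):  # q(A) sums the projections whose eigenvalue lies in A
            q_a_phi = [s + p for s, p in zip(q_a_phi, project(state, vec))]
    return inner(q_a_phi, state).real

# Hypothetical example: eigenvalues +1 and -1, state an equal superposition.
Q = [(1.0, [1.0, 0.0]), (-1.0, [0.0, 1.0])]
phi = [1 / math.sqrt(2), 1j / math.sqrt(2)]
p_positive = born_probability(Q, phi, lambda x: x > 0)
```

Nothing in the sketch constructs a common sample space for several quantities; each call yields a measure for one quantity in one state, which is the point of the general setting.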
1 Cohen [1966] is a good example of how this restrictive phase space setting seems to have 
dominated the vision of investigators in QM and kept them from seeing the variety and 
range of ways in which probability is actually used in science. 
2 For more details of this setting see my [1971]. 

It may be useful to illustrate the differences here by means of very 
simple examples. Suppose we are interested in just the quantities height 
($Q_1$) and weight ($Q_2$) for the individuals of a given finite population. We 
could make a ‘phase space’ consisting of the population itself. Each subset 
S of the population will be measured by a probability P defined by 

$$P(S) = \frac{\text{cardinality of } S}{\text{cardinality of the population}}.$$

Then if A is an interval of numbers, the probability $P_{Q_i}[A]$ that $Q_i$ is 
confined to A is just 

$$P_{Q_i}[A] = P[\{x \mid Q_i(x) \in A\}].$$

And in this way of proceeding we can get easy answers to questions about 
the probability that the height lies in a certain range and the weight in 
another. 
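
The population-as-phase-space construction can be sketched as follows. The four individuals, their measurements and the helper names are invented for illustration; only the two defining formulas are taken from the text.

```python
from fractions import Fraction

# Hypothetical population: individual -> (height in cm, weight in kg).
population = {
    "a": (150, 50),
    "b": (160, 60),
    "c": (170, 60),
    "d": (180, 80),
}

def measure(subset):
    """P(S) = cardinality of S / cardinality of the population."""
    return Fraction(len(subset), len(population))

def height(x):
    return population[x][0]

def weight(x):
    return population[x][1]

def prob(quantity, in_a):
    """P_Q[A] = P[{x | Q(x) in A}], with the set A given as a predicate."""
    return measure({x for x in population if in_a(quantity(x))})

def joint_prob(in_a, in_b):
    """Probability that height lies in A and weight lies in B."""
    return measure({x for x in population if in_a(height(x)) and in_b(weight(x))})

tall = lambda h: h >= 165
heavy = lambda w: w >= 60
p_tall = prob(height, tall)
p_heavy = prob(weight, heavy)
p_tall_and_heavy = joint_prob(tall, heavy)
```

Because both quantities are functions on one and the same space, `joint_prob` comes for free once `measure` is fixed.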

This way of proceeding is no doubt useful for a stable population that 
can be examined or sampled so as to determine the universal measure P. 
Suppose, however, that the situation is different. Suppose our population 
is that of some ancient city over a specified time span and that the only 
data available to us are obtained from records which in alternate years 
record just one of height or weight for each inhabitant. From this finite set 
of books we could readily compile separate height and weight distributions. 
But if asked for the joint distribution how could we respond? Only, I 
would suppose, by making some hypothesis about the ways these quantities 
vary in time and then interpolating on the basis of such a hypothesis for 
the missing years. If we had no way of checking out such a hypothesis or if 
we had positive reasons to believe that no such interpolations were correct, 
then I suppose we should simply back off any attempt to impose a joint 
distribution. Thus we should display probability as in the more general 
setting. But we should scarcely be tempted to hail this as the discovery of a 
new kind of probability or of an old kind oddly used. The same is true in 
the case of QM. 
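
The predicament of the ancient-city records can be made concrete with a small sketch. The two joint distributions below are hypothetical; they are chosen so that the separate height and weight distributions agree while the joint distributions differ, which is why separate records alone cannot settle the joint question.

```python
from fractions import Fraction

half = Fraction(1, 2)

# Two hypothetical joint distributions over (height class, weight class).
joint_correlated = {("short", "light"): half, ("tall", "heavy"): half}
joint_anticorrelated = {("short", "heavy"): half, ("tall", "light"): half}

def marginal(joint, index):
    """Sum a joint distribution down to one of its two coordinates."""
    out = {}
    for outcome, p in joint.items():
        out[outcome[index]] = out.get(outcome[index], 0) + p
    return out

same_heights = marginal(joint_correlated, 0) == marginal(joint_anticorrelated, 0)
same_weights = marginal(joint_correlated, 1) == marginal(joint_anticorrelated, 1)
same_joint = joint_correlated == joint_anticorrelated
```

Any choice between the two joints would have to come from a further hypothesis, as in the interpolation described above.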





5 THE GENERALISATION 


I have tried to argue that common sense and ordinary logic and classical 
probability all sit happily with one another in the quantum domain (as 
well as elsewhere). My argument is this. The tension among these elements 
is generated by a certain restrictive use of probability, the use which requires 
that the probability of compounds must follow from the probability 
of components. I argue that this restrictive use is not built into the 
presuppositions of probabilistic discourse, nor into the classical theory of 
probability, nor is it part of a general framework for the applications of 
probability in science. Thus we are free to discard this restriction and can 
thereby restore harmony. 

Specifically, in the case of the two-slit experiment, my suggestion is 
that we can consistently accompany the quantum mechanical analysis of 
that experimental arrangement with the following story. Each electron 
leaves the source, passes beyond the barrier by going through exactly one 
of the two holes, and arrives then on the detecting screen. The events of 
interest, that an electron passes through a hole, that it arrives at a particular 
location on the detecting screen, and that it does both of these (one 
after the other) are all well-defined and connected logically in the classical 
way. That is, the last of these is true just in case the first two are true 
(in sequence). The probability, in the classical sense, for the first two 
events is likewise well-defined and it is determinable from the quantum 
theoretical analysis of the experiment. If that quantum theoretical analysis 
is correct, however, then the probability for the compound event is not 
well-defined. Indeed, the foundational results about the non-existence of 
so-called ‘joint distributions’ show that there is no uniform way of defining 
that compound probability which would mesh consistently with the other 
probabilistic assignments of quantum theory. Thus the quantum theory 
enables us to find out about interesting features of our environment; in 
particular, that certain well-defined situations do not admit of probabilistic 
assessment. Given the view of probability that I have outlined, one would 
expect that there are such situations to be found. 

Let me now try, as I suggested in the introduction, to generalise this 
harmonious picture. Suppose we are dealing with the quantum theoretical 
analysis of some given system. The theory associates with the system a 
Hilbert space and, in a familiar way, a set of states, a set of quantities and 
with each state φ and quantity Q a probability measure $P_Q^\phi$ on the Borel 
subsets of the real numbers. This measure bears the intuitive reading 


$P_Q^\phi[A]$ = the probability that in state φ the quantity Q is confined to 
the set A. 


I do not want to rehearse the details here, but it is important to recognise 
that the development of these probability measures employs the theory of 
spectral measures and makes critical use of the orthocomplemented 
lattice structure (and hence the inner product properties) of the Hilbert 
space. That is to say, using the probabilistic assignments of the theory 
already presupposes use of its mathematical formalism. 

Looked at from a logical perspective, the theory is concerned with a 
family $\mathcal{P}$ of propositions, each of the form 


‘quantity Q is confined to (Borel) set A’. 

I shall represent this by 

$$Q[A].$$

Each state φ then determines an assignment of probability $P^\phi$ to each 
proposition, given by 

$$P_Q^\phi[A],$$

as above. From this perspective my harmonious picture can be sketched 
as follows. 

Look on the family $\mathcal{P}$ as the atomic propositions of a propositional 
language $\mathcal{L}$, by introducing the usual connectives (∼, ∧, ∨) and requiring 
that $\mathcal{L}$ be the minimal extension of $\mathcal{P}$ to be closed under these connectives. 
So, now I can say in $\mathcal{L}$ that a quantity is not confined to a certain set, or 
that pairs (triples, etc.) of quantities are each confined to certain sets, and 
so on. My suggestion with regard to the classical nature of the logic of QM 
is that consistent with the truth conditions forced out by the probability 
assignments of quantum theory are bivalent truth-value assignments, assigning 
truth or falsity to each atomic proposition in $\mathcal{P}$ and extending in the 
classical way to valuations yielding truth-values for every proposition in $\mathcal{L}$. 
If we think of such a valuation as a possible state of the world, then the first 
component of my picture is that QM allows that each of the propositions 
generated by the theory is either true or false (exclusively) in every state of 
the world. The second component of my picture has to do with assignments 
of probability to the propositions of $\mathcal{L}$. It is the claim that in assigning 
probability QM does not make use of all of $\mathcal{L}$. Each state in QM determines 
a maximal assignment of probability to a proper subset of the propositions 
of $\mathcal{L}$ in an entirely classical way. But not only are there, corresponding to 
each probability assignment, some propositions outside the domain of 
that assignment, there are in addition some propositions not assigned a 
probability by any assignment. Propositions asserting simultaneous 
precise values of position and momentum are of this latter sort, as are 
those asserting a series of positions for a system (e.g. as in the two-slit 
experiment: passes through hole A and then arrives at X). Given the 
classical and determinate nature of $\mathcal{L}$ and the foundational results showing 
that the probability assignments of QM cannot be consistently augmented 
so as to cover all the propositions of $\mathcal{L}$, it seems to me entirely reasonable 
to conclude in this general setting, as I did in the case of the two-slit 
experiment, that QM teaches us that certain physically realisable situations 
are not amenable to probabilistic assessment. 

Now, I think this picture of QM is indeed a pretty and harmonious 
one. It rests, however, on two unsubstantiated consistency claims: (1) that 
$\mathcal{L}$ admits of bivalent valuations consistent with QM and (2) that $\mathcal{L}$-plus-valuations 
admit of just those probability assignments that QM does in 
fact make. Moreover, if one puts these claims together they appear to 
represent QM as a merely statistical theory of a determinate domain. In 
that case one might suspect that the statistics can be reduced by intro- 
ducing hidden variables, or the like. But a variety of results in the literature 
seem to show that such a procedure is impossible. Hence the objection 
arises that the substantiation of (1) and (2) will run afoul of these no- 
hidden-variables results. I shall show that this is not so. 


6 ADMISSIBLE VALUATIONS¹ 


The consistency required by claim (1) is straightforward. QM assigns 
probabilities 1 and 0 to various propositions in certain states and we want 
the corresponding valuations to insure that these propositions come out, 
respectively, true and false. This apparently simple requirement is from 
the point of view of QM already quite complex. For, as I have emphasised, 
the probability assignments of QM involve the whole mathematical 
apparatus of Hilbert spaces. In particular the condition just posed requires 
that the valuations follow the location of the state vector among the sub- 
spaces of the Hilbert space as well as its orthogonality relations to the 
other vectors of the space in so far as these issue in 0 or 1 probabilities. 
Although this consistency requirement is already a strong one it is not, by 
itself, sufficient to insure that the valuations fit closely with QM as usually 
understood nor to insure that the propositions can bear the interpretation 
as confining some quantity to some set. This latter requirement can easily 
be built in by constraining the valuations to reflect the relevant portions of 
the algebra of sets. The problem of constraining the valuations so as not to 
be wildly out of line with ordinary QM is more difficult to come to terms 
with. For clearly in so far as we interpolate values which ordinary QM 
eschews, for instance in tracing out the path of electrons through the two- 
slit apparatus, we must already go beyond QM in the assignments of truth. 


1 For the remainder of this paper I shall assume that the quantities have pure, discrete 
spectra (which may be degenerate). The problem of specifying conditions for admissible 
valuations if one treats quantities with continuous spectra is delicate and requires 
attention to topological features of the real line. I think that paying such attention here 
would only obscure the exposition. I would just point out that the conditions about to be 
specified are completely unreasonable in the continuous case. I owe the idea for 
formulating such conditions on admissible valuations to M. Friedman and C. Glymour 
[1972]. They do it differently, by treating the valuations as mappings from subspaces of a 
Hilbert space to {0, 1}. I think this obscures the connection to the propositions of the 
language, and so makes it difficult to assess the reasonableness of their conditions. In 
particular the condition they take as asserting that every observable has a precise value 
(that exactly one vector out of every orthonormal set gets assigned truth) is unreasonably 
strong. See my requirement (f) below and the discussion of Gleason’s Theorem in 
Section 7. 

There is, however, a regularity condition which seems to me natural and 
which will keep things somewhat in line. It concerns what Mackey calls 
questions.¹ These are the quantities represented by the projection operators. 
They each have only two possible eigenvalues, 0 and 1. If $P_M$ is the operator 
projecting on the closed subspace M then it takes eigenvalue 1 in a 
state φ just in case φ is in the subspace M. Thus each quantity $P_M$ can be 
interpreted as asking whether the state vector lies in that subspace M of the 
Hilbert space, and as having the value 1 just in case the answer is ‘yes’. 
These projection operators play a fundamental role in the theory, for they 
effectively determine the probability assignments. As a regularity requirement, 
then, I propose to make the valuations give this interpretation to 
these questions. These considerations are embodied in the following 
requirements. 

(a) To each state φ there corresponds a function $v^\phi$ from $\mathcal{L}$ onto {0, 1}. 

(b) If $P_Q^\phi[A] = 1$, then $v_Q^\phi[A] = 1$. 

(c) If $v_Q^\phi[A] = 1$ and $A \subseteq B$, then $v_Q^\phi[B] = 1$. 

(d) If $\bar{A}$ is the complement of A, then 
$v_Q^\phi[\bar{A}] = 1$ only if $v_Q^\phi[A] = 0$. 

(e) If $P_M$ is the operator projecting on the closed subspace M then 
$v_{P_M}^\phi[\{1\}] = 1$ iff $\phi \in M$. 

There is a further condition of interest, but before giving it some 
comments are in order on the conditions already stated. Notice that if 
$P_Q^\phi[A] = 0$ then $v_Q^\phi[A] = 0$. For $P_Q^\phi[A] = 0$ iff $P_Q^\phi[\bar{A}] = 1$. By (b), this 
implies that $v_Q^\phi[\bar{A}] = 1$. But by (d) this yields $v_Q^\phi[A] = 0$. Thus (b) and (d) 
are sufficient for the requirement that the valuations $v^\phi$ be consistent with 
the 0 and 1 probabilities of QM. By virtue of (c) and (d) the valuations 
reflect the interpretation of the propositions as confining the quantities to 
sets. The regularity condition is given by (e), so that a question as to the 
location of the state vector is answered affirmatively just in case the state 
vector is indeed in the right place. 
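
For a single quantity with a finite spectrum, conditions (b), (c) and (d) can be checked mechanically. The following sketch is illustrative only: the three-point spectrum, the distribution `dist` and the two candidate valuations are assumptions made for the example, and conditions (a) and (e), which concern the whole language and the projection operators, are not modelled.

```python
from fractions import Fraction
from itertools import combinations

spectrum = (1, 2, 3)  # hypothetical discrete spectrum of a quantity Q
dist = {1: Fraction(1, 2), 2: Fraction(1, 2), 3: Fraction(0)}  # P_Q^phi on points

def subsets(points):
    """All subsets of the spectrum, standing in for the Borel sets."""
    return [frozenset(c) for r in range(len(points) + 1) for c in combinations(points, r)]

def prob(a):
    return sum((dist[x] for x in a), Fraction(0))

def satisfies_b_c_d(v):
    """Return the truth of (b), (c), (d) for a 0/1-valued function v on subsets."""
    sets = subsets(spectrum)
    full = frozenset(spectrum)
    cond_b = all(v(a) == 1 for a in sets if prob(a) == 1)
    cond_c = all(v(b) == 1 for a in sets if v(a) == 1 for b in sets if a <= b)
    cond_d = all(v(a) == 0 for a in sets if v(full - a) == 1)
    return cond_b, cond_c, cond_d

# Two candidate valuations (both hypothetical):
certainty = lambda a: 1 if prob(a) == 1 else 0    # true only at probability 1
possibility = lambda a: 1 if prob(a) > 0 else 0   # true at any positive probability

result_certainty = satisfies_b_c_d(certainty)
result_possibility = satisfies_b_c_d(possibility)
```

The second candidate fails (d): with the distribution above, both {1} and its complement have positive probability, so both would be counted true.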

Conditions (a) through (e) specify the admissible valuations for the 
propositional language $\mathcal{L}$. There is, however, a further condition worth 
looking at. 

(f) To every quantity Q and state φ there corresponds some real 
number x such that 
$v_Q^\phi[\{x\}] = 1$. 

Intuitively, (f) says that in every possible state of the world each quantity 
has a precise numerical value. If an admissible valuation satisfies (f), I 
shall call it atomic. Notice that if $v^\phi$ is atomic then it makes sense to talk 
about ‘the value of Q in φ’. For if $v_Q^\phi[\{x\}] = 1$ then $v_Q^\phi[\overline{\{x\}}] = 0$. So if 
$y \neq x$, then $\{y\} \subseteq \overline{\{x\}}$; and hence it follows from (c) that $v_Q^\phi[\{y\}] = 0$. 
In this case we have that 

$$v_Q^\phi[A] = 1 \text{ iff } x \in A,$$

for all Borel sets A. Hence for an atomic $v^\phi$ it follows that 

$$v_Q^\phi[A \cup B] = 1 \text{ iff } v_Q^\phi[A] = 1 \text{ or } v_Q^\phi[B] = 1;$$

$$v_Q^\phi[A \cap B] = 1 \text{ iff } v_Q^\phi[A] = v_Q^\phi[B] = 1; \text{ and}$$

$$v_Q^\phi[\bar{A}] = 1 \text{ iff } v_Q^\phi[A] = 0.$$

Thus if $v^\phi$ is atomic the map (for each fixed Q) 

$$A \mapsto v_Q^\phi[A]$$

is a homomorphism from the σ-algebra of Borel sets to the Boolean 
algebra on {0, 1}. 

1 Mackey [1963]. 
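
The homomorphism property of an atomic valuation can be verified exhaustively on a small finite spectrum. In this sketch the spectrum and the assumed value of Q are hypothetical, and the subsets of the spectrum stand in for the Borel sets.

```python
from itertools import combinations

spectrum = (1, 2, 3)  # hypothetical discrete spectrum of Q
value = 2             # assumed value x of Q under the atomic valuation

def subsets(points):
    return [frozenset(c) for r in range(len(points) + 1) for c in combinations(points, r)]

def v(a):
    """Atomic valuation: v_Q[A] = 1 iff the value x of Q lies in A."""
    return 1 if value in a else 0

full = frozenset(spectrum)
sets = subsets(spectrum)

# Union goes to 'or', intersection to 'and', complement to negation.
union_ok = all(v(a | b) == max(v(a), v(b)) for a in sets for b in sets)
meet_ok = all(v(a & b) == min(v(a), v(b)) for a in sets for b in sets)
complement_ok = all(v(full - a) == 1 - v(a) for a in sets)
```

All three checks hold for any choice of `value` in the spectrum, since the valuation is just membership of a fixed point.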


7 ATOMIC VALUATIONS 


Are there any atomic valuations? There is a plausible line of argument, 
connected with the no-hidden-variable proofs, which would seem to rule out 
the possibility of atomic valuations. Although I shall shortly actually specify 
some atomic valuations, I want first to consider how this plausible line of 
argument can be by-passed. To that end it is instructive to begin by con- 
sidering what an atomic valuation amounts to in the case of the two-slit 
experiment, for this will point up just what additional constraints on the 
admissible valuations might be reasonable. 

Consider the state φ of an electron at the time t when, intuitively 
speaking, it is about to cross the barrier imposed by the plate containing 
the two holes. At this time φ is a superposition involving the (approximate) 
position eigenstates $\phi_A$, $\phi_B$ corresponding to passage, respectively, through 
hole A and hole B. We can represent the (approximate) position of the 
electron by an operator Q whose spectral resolution would be given by 

$$Q = \lambda_A P_A + \lambda_B P_B + \ldots$$

where the lambdas are the (approximate) position eigenvalues corresponding 
to the (approximate) position eigenstates, and where the $P_A$, $P_B$ are 
projections on to the subspaces spanned respectively by these eigenstates. 
(The terms omitted correspond to the remaining cells in the grid of approximate 
positions.) According to the hypothesis of mutually exclusive 
passage it should be true at time t that the electron is localised around one 
of the holes, say hole A for the case at hand. In order to get the interference 
and not the additive pattern, however, it should be false that φ is an 
eigenstate of (approximate) position around a hole. For otherwise, the 
state later on would not correspond to the right superposition. Thus in 
order to carry out the programme required by the story we want to tell 
for the two-slit experiment the valuation $v^\phi$ associated with state φ must 
yield $v_Q^\phi[\{\lambda_A\}] = 1$ and $v_{P_A}^\phi[\{1\}] = 0$. In other words, implementing our 
covering story involves asserting that a quantity (here Q) can have a value 
($\lambda_A$) although it is not true that the state of the system is the corresponding 
eigenstate. I have argued elsewhere that by thus abandoning the strict 
eigenvalue-eigenstate link we can resolve the general measurement 
problem in QM.¹ I want to emphasise here that breaking the link is an 
essential ingredient in the story of exclusive passage that motivates the 
introduction of valuations. Thus any restriction on the valuations which 
would reinstate the eigenvalue-eigenstate link will rule out the possibility 
of covering an experiment like the two-slit one with a story like the one of 
exclusive passage. Hence any such restriction on the valuations is 
unreasonable, for it would defeat the very purpose for which the valuations 
have been introduced. 

There is such a restriction involved in the no-hidden-variable proofs. 
And the point of the preceding discussion is to argue that the restriction 
ought to be jettisoned and that thereby the conclusion of these proofs can 
be ignored. The restriction goes this way. In QM if Q is a quantity and f a 
Borel function we can define a quantity f(Q) by requiring that 

$$P_{f(Q)}^\phi[A] = P_Q^\phi[f^{-1}(A)] \qquad (*)$$

for all states φ and Borel sets A. This definition accords with the usual 
operator calculus so that if $Q = \sum_n \lambda_n Q_n$ is the spectral resolution of Q we 
have f(Q) given by $f(Q) = \sum_n f(\lambda_n) Q_n$. According to this definition, then, the 
propositions $f(Q)[A]$ and $Q[f^{-1}(A)]$ have precisely the same probability in 
all states. Now, of course, it does not follow from the fact that a pair of 
events have identical probabilities for their occurrence that the events 
themselves somehow conspire always to co-occur. But clearly if they did 
so conspire the world would be in harmony. Thus the suggestion arises 
that we move from the same probability for these propositions to the same 
truth conditions. That is, we are asked to restrict our valuations so that 

$$v_{f(Q)}^\phi[A] = v_Q^\phi[f^{-1}(A)] \qquad (**)$$

for all states φ and Borel sets A. 

Although the probability assignments of QM do not require that the 
truth-values of propositions involving functions of quantities be related as in 
(**), it might be thought that (**) is plausible on independent grounds. 
For if we have an atomic valuation then one might reasonably argue that 
the values of f(Q) are simply obtained by applying f to the values of Q. So 
if it is true that f(Q) has a value in some set A, then the value of Q must be 
such that f applied to it falls in A; i.e. the value of Q must be in $f^{-1}(A)$. 
This, however, is no more than what is required by (**). Hence this 
requirement seems plausible, indeed inevitable, for an atomic valuation. 
This line of argument, however, rests on the assumption that the quantum 
mechanically defined operator f(Q) is just the result of normal function 
composition; that is, that f(Q) is just the composition of f with the function 
that associates with each state φ the value of Q in φ. If we revert to 
the two-slit experiment, however, then we can see that this assumption is 
neither required by the QM account of f(Q) nor is it consistent with our 
realistic covering story. 

1 Fine [1973]. 

Consider the Borel function f defined by 

$$f(x) = \begin{cases} 1, & x = \lambda_A \\ 0, & x \neq \lambda_A \end{cases}$$

where $\lambda_A$ is the A-hole eigenvalue from the two-slit experiment. It follows 
from the spectral resolution of Q that $f(Q) = P_A$. (I assume that distinct 
approximate positions get distinct eigenvalues, i.e. that there is no 
degeneracy.) If we try to trace here the preceding line of argument we can 
see where it falters. First notice that 

$$v_{f(Q)}^\phi[\{1\}] = v_{P_A}^\phi[\{1\}] = 0, \text{ by (e).}$$

Since $f^{-1}[\{1\}] = \{\lambda_A\}$, we have 

$$v_Q^\phi[f^{-1}\{1\}] = v_Q^\phi[\{\lambda_A\}] = 1.$$

Thus 

$$v_{f(Q)}^\phi[\{1\}] \neq v_Q^\phi[f^{-1}\{1\}],$$

in violation of (**). Now as above the value (in φ) of Q is $\lambda_A$. What is the 
value (in φ) of f(Q)? If $v^\phi$ is atomic, we have 

$$v_{f(Q)}^\phi[\{0\}] = 1.$$

So, the value of f(Q) is 0. But $f(\lambda_A) \neq 0$; i.e. f(the value of Q) is not equal 
to the value of f(Q). Thus f(Q) does not correspond here to ordinary 
function composition and the condition (**) is violated. Indeed to require 
(**) here would be precisely to require that the electron pass through a 
hole just in case its state is the corresponding eigenstate. Thus (**) 
embodies the very eigenvalue-eigenstate link that our covering story is 
designed to overcome. 

Thus the concurrence of truth values required by (**) is neither a 
consequence of the probabilities assigned by QM nor is it reasonable on its 
own, for it reinstates the eigenvalue-eigenstate link. It would be unreasonable 
to impose (**) on the requirements for admissible valuations, and I 
shall not do so. In this way I avoid the consequences of the no-hidden-variable 
proofs. For Kochen and Specker have established a result which 
implies that were (**) required then there would be no atomic valuations.¹ 
By not requiring (**), we make the Kochen and Specker result inapplicable. 
That result is, however, just a special case (the case where the state 
space is three-dimensional) of a deep theorem due to Gleason.² This 
theorem has the following corollary. There is no function f from the unit 
sphere of a Hilbert space into {0, 1} such that $\sum_n f(\phi_n)$ is the same constant 
for every orthonormal basis $\{\phi_n\}$ of the space. Now, however, it might be 
thought that if there were an atomic valuation $v^\phi$ then we could define 
such a function f by setting 

$$f(\phi') = v_{P_{\phi'}}^\phi[\{1\}]$$

for every φ′ on the unit sphere. Then, so goes the argument, since every 
quantity has a unique value in the state φ the quantity 

$$Q = \sum_n \lambda_n P_{\phi_n}$$

associated with an orthonormal basis $\{\phi_n\}$ has a unique value in φ (where 
we choose any nice numbers for the $\lambda_n$). If the value is $\lambda_k$, say, then 
$f(\phi_k) = 1$ and $f(\phi_j) = 0$ for $j \neq k$. So the sum of f over any orthonormal 
basis is precisely 1. If this line of argument were correct, it would follow 
from the corollary of Gleason’s theorem that there are no atomic valuations. 
We can see, however, that the argument depends on the assumption 
that if Q has the value $\lambda_k$ then $P_{\phi_k}$ has the value 1. It depends, that is, on 
the eigenvalue-eigenstate link, which our admissible valuations do not 
incorporate. We can see, moreover, that the regularity requirement (e) on 
the valuations rules out this use of Gleason’s theorem (as well as the 
three-dimensional case constructed by Kochen and Specker). For consider an 
orthonormal basis $\{\phi_n\}$ which does not include the state φ (or any multiple 
of φ). It follows from (e) that $f(\phi_n) = 0$ for each n; so on such a basis 
$\sum_n f(\phi_n) = 0$. There are, however, orthonormal bases which do contain φ 
among the basis vectors. On such a basis f takes the value 0 everywhere 
except at φ itself, and f(φ) = 1. Hence the sum of f over the basis vectors 
here is equal to 1. Thus condition (e) prevents f from satisfying the 
hypothesis of this corollary of Gleason. The conclusion, which would rule out 
atomic valuations, is therefore inapplicable. 
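
The way condition (e) blocks the appeal to Gleason's theorem can be illustrated numerically. The sketch below assumes a real three-dimensional space and two hand-picked orthonormal bases; the names `f` and `basis_sum` and the tolerance are choices made for the example. For unit vectors, φ lies in the span of φ′ exactly when the absolute value of their inner product is 1, which is how (e) is implemented here.

```python
import math

def inner(u, v):
    """Inner product for real vectors."""
    return sum(a * b for a, b in zip(u, v))

def f(phi, phi_prime, tol=1e-12):
    """f(phi') is the truth-value of the question P_{phi'}; by (e) it is 1 iff phi is in span(phi')."""
    return 1 if abs(abs(inner(phi, phi_prime)) - 1.0) < tol else 0

def basis_sum(phi, basis):
    """Sum of f over the vectors of an orthonormal basis."""
    return sum(f(phi, b) for b in basis)

phi = (1.0, 0.0, 0.0)
s = 1 / math.sqrt(2)

# One basis that contains phi and one that contains no multiple of phi.
basis_with_phi = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
basis_without_phi = [(s, s, 0.0), (s, -s, 0.0), (0.0, 0.0, 1.0)]

sum_with = basis_sum(phi, basis_with_phi)
sum_without = basis_sum(phi, basis_without_phi)
```

The two sums differ, so the function defined through (e) is not constant across orthonormal bases and does not meet the hypothesis of the corollary.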


1 Kochen and Specker [1967], Theorem 1 (see the Remark). 
2 Gleason [1957]. 
3 Van Fraassen [1973] also deals with the no-hidden-variable proofs as a barrier to a 
semantics for the language of QM. His strategy is different from mine, for he simply 
denies that probabilities individuate quantities. While this is a possible strategy it seems 
to me to lack direct physical support. Another point of difference is in the treatment of 
questions, as marked by my regularity requirement (e). Nevertheless, there is 
considerable overlap between his general approach to QM and mine. 

This discussion of the inapplicability of the no-hidden-variable proofs 
brings out an important feature of the valuations. For the following 
proposition has been established: if $v^\phi$ is an atomic valuation under which the 
value of a quantity is an eigenvalue, then there are quantities Q such that 
$v_Q^\phi[\{x\}] = 1$ for some x and yet $v_{Q'}^\phi[\{1\}] = 0$ where Q′ projects on the 
subspace of eigenstates of Q with eigenvalue x. Thus we have established 
that such atomic valuations must break the eigenvalue-eigenstate link. 
The proof of this is just the argument sketched above in connection with 
Gleason’s theorem. Since this feature of the valuations is presupposed by 
the sort of story we want to give for the two-slit experiment, the 
demonstration of this feature provides reason to think that the conditions laid 
down for the admissible valuations generalise that story in the proper way. 
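
The two-slit instance of this broken link, and the resulting failure of (**), can be put in a few lines. Everything numerical here is an assumption for illustration: the eigenvalues, the equal superposition, and the stipulation (taken from the covering story) that Q has the A-hole value. The valuation for the question about hole A is fixed by condition (e), together with the assumption that its value is one of its eigenvalues 0 and 1.

```python
import math

LAMBDA_A, LAMBDA_B = 1.0, 2.0  # hypothetical position eigenvalues for holes A and B
# Components of the state in the (phi_A, phi_B) basis: an equal superposition.
phi = (1 / math.sqrt(2), 1 / math.sqrt(2))

def v_Q(borel_set, value_of_q=LAMBDA_A):
    """Atomic valuation for Q: the covering story stipulates the value lambda_A."""
    return 1 if value_of_q in borel_set else 0

def v_P_A(borel_set):
    """Valuation for the question P_A, fixed by (e): value 1 iff phi lies in span(phi_A)."""
    in_subspace = abs(phi[1]) < 1e-12
    value = 1 if in_subspace else 0
    return 1 if value in borel_set else 0

def f(x):
    """Borel function with f(Q) = P_A."""
    return 1 if x == LAMBDA_A else 0

def preimage(borel_set, spectrum=(LAMBDA_A, LAMBDA_B)):
    """The inverse image of a set under f, restricted to the spectrum of Q."""
    return {x for x in spectrum if f(x) in borel_set}

lhs = v_P_A({1})          # truth-value of 'f(Q) is confined to {1}'
rhs = v_Q(preimage({1}))  # truth-value of 'Q is confined to the inverse image of {1}'
```

Here `lhs` and `rhs` disagree, and f applied to the value of Q differs from the value assigned to f(Q).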


8 SOME ADMISSIBLE VALUATIONS¹ 


I shall construct two families of admissible valuations both of which are 
connected with quantum physics as standardly practised, one of which is 
atomic and one not. 

The first family of valuations seems to me to correspond to the Copenhagen 
interpretation of QM cast in realistic and extensional terms. Associate 
with each state φ of QM a valuation $v^\phi$ defined by 

$$v_Q^\phi[A] = 1 \text{ iff } P_Q^\phi[A] = 1,$$

for all quantities Q and Borel sets A. Thus all propositions of the form 
$Q[A]$, which are all the propositions in $\mathcal{P}$, get an assigned value. The 
truth-value under $v^\phi$ of the compound propositions of $\mathcal{L}$ is then forced 
out of this assignment by the usual recursive rules, satisfying condition (a). 
Clearly, (b) is satisfied as well. The set-theoretic conditions (c) and (d) 
follow from the corresponding conditions for probability measures. And 
the regularity condition (e) becomes precisely the quantum mechanical 
rule for projections. These valuations, however, are not atomic. For 
example, if Q has a discrete and multiplicity-free spectrum, then 
$v_Q^\phi[\{x\}] = 1$ iff φ is an eigenstate of Q with eigenvalue x. Hence if we 
take for φ a superposition of some eigenstates of Q with distinct eigenvalues 
then the condition (f) for atomic valuations fails for this choice of Q and φ. 

1 I am very grateful to Clark Glymour for pointing out an error in a previous version of 
this paper. The error resulted from requiring something stronger than the condition 
(d) on admissible valuations; namely, requiring that 

$$v_Q^\phi[\bar{A}] = 1 \text{ iff } v_Q^\phi[A] = 0.$$

Although there are non-atomic valuations that satisfy this stronger requirement (for 
example, define $v_Q^\phi[A] = 1$ iff $P_Q^\phi[A] > \tfrac{1}{2}$) the interesting non-atomic valuation of this 
section does not satisfy it. The requirement, moreover, goes against the spirit of a 
non-atomic valuation. For the intuition in the non-atomic case is that ‘the value of Q’ denotes 
some extended entity. Thus $v_Q^\phi[A] = 1$ just in case the extended entity which is the value 
of Q lies entirely in A. But this intuition would lead us to expect some cases where the 
extended value of Q overlaps both A and $\bar{A}$. That is, we would expect cases where both 
$v_Q^\phi[A] = 0$ and $v_Q^\phi[\bar{A}] = 0$. From this intuitive point of view the strong requirement 
above is too strong and it is rather (d) that we want. For further discussion of a similar 
point see Fine [1971], Section 4. 


Nevertheless this seems to me a natural and intuitive way of approaching 
QM. One possible disadvantage is that since these valuations are not atomic 
there seems no way of making sense of the notion of the value of a quantity 
(in a state). If one is restricted to point values, then of course this is correct. 
I have argued elsewhere, however, that it is useful to think of the value of a 
quantity as an entire set of numbers (see Fine [1971]). If this is allowed, 
then there is an obvious way to introduce ‘values’ here. Namely, associate 
with each quantity Q and state φ the set φ(Q), called the ‘value of Q in φ’, 
defined by 

$$\phi(Q) = \bigcap \{B \mid P_Q^\phi[B] = 1\}.$$

If Q has a discrete spectrum then φ(Q) is always non-empty (this is not 
true in the continuous case) and consists of eigenvalues of Q. Using the 
fact that $P_Q^\phi[\phi(Q)] = 1$ it is straightforward to prove that 

$$v_Q^\phi[A] = 1 \text{ iff } \phi(Q) \subseteq A.$$

So, the proposition that Q is confined to A is true (in φ) just in case the 
value of Q (in φ) is contained in A. That $v^\phi$ is not atomic, then, just amounts 
to the fact that no matter what the state is there are always quantities 
whose values in that state do not reduce to singletons. 
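
As an illustration only, the set-valued picture can be sketched for a quantity with a discrete, multiplicity-free spectrum. The sketch is not part of the original argument: the helper names (`value_set`, `valuation`), the numerical threshold and the sample amplitudes are all assumptions chosen for the example. A state is given by its amplitudes in the eigenbasis of Q, $\Phi(Q)$ is computed as the set of eigenvalues with non-zero probability, and the valuation is read off from whether $\Phi(Q)$ is contained in A.

```python
# Illustrative sketch (hypothetical names): set-valued "values" for a quantity
# Q with a discrete, multiplicity-free spectrum.

TOL = 1e-12  # numerical threshold for "probability zero" (an assumption)


def probabilities(amplitudes):
    """Born probabilities |c|^2 for each eigenvalue of Q."""
    return {lam: abs(c) ** 2 for lam, c in amplitudes.items()}


def value_set(amplitudes):
    """Phi(Q): the smallest set of eigenvalues carrying probability 1."""
    return {lam for lam, p in probabilities(amplitudes).items() if p > TOL}


def valuation(amplitudes, A):
    """Non-atomic valuation: 1 iff Phi(Q) is contained in A, else 0."""
    return 1 if value_set(amplitudes) <= set(A) else 0


# An eigenstate of Q (eigenvalue 2) and a superposition of two eigenstates.
eigenstate = {1: 0.0, 2: 1.0, 3: 0.0}
superposition = {1: 0.6, 2: 0.8, 3: 0.0}

print(value_set(eigenstate))
print(value_set(superposition))
# No singleton proposition is true in the superposition: the valuation is
# not atomic, although the proposition for the whole value set is true.
print([valuation(superposition, {lam}) for lam in (1, 2, 3)])
print(valuation(superposition, {1, 2}))
```

In the superposition the value of Q is the two-element set, so every proposition confining Q to a single eigenvalue comes out false, which is the failure of condition (f) described above.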

According to this way of introducing valuations and values for a
quantity, every quantity has a value in every state. In particular, for any
system, it is true in each state that the system has both a position and a
momentum. But these position and momentum values are sets of numbers,
not numbers. And it follows from the uncertainty relations that if in state
$\phi$ the position is within some very small interval, then in $\phi$ the momentum
is spread out over a large range. Thus if A and B are themselves very
small intervals of numbers, then the conjunction of the propositions
asserting that the position is confined to A and that the momentum is
confined to B will be false in every state. It is to be expected, then, that
joint probabilities for position and momentum fail to exist, since no state
$\phi$ represents a situation of chance for simultaneously sharp position and
momentum. In general, where joint distributions fail to exist, there will
turn out to be compound propositions in $\mathcal{L}$ that are false for all the
admissible valuations. In the case of the two-slit experiment this means
that for an electron found on the detecting screen, it is false that it went
through hole A and also false that it went through hole B. Of course the
electron’s position as it approached the barrier was a finite range of position
coordinates, including coordinates corresponding to the two holes. But
how it managed to cross that barrier remains, in this Copenhagen account,
the sort of mystery that I cited Feynman on at the beginning of the paper.
If we want an account that clears up the mystery, by supporting the
hypothesis of mutually exclusive passage, we must look for some atomic
valuations.

Fortunately, they are not hard to find. Consider the sets $\Phi(Q)$ which
state $\phi$ associates with quantity Q as defined above. Each $\Phi(Q)$ is the
smallest set (of eigenvalues of Q) whose probability under $P^{\phi}_{Q}$ is 1. In
particular if Q is a projection operator, then $\Phi(Q)$ is always a subset of
$\{0, 1\}$. It is natural to think of $\Phi(Q)$ as constituting the only possible
values for Q in state $\phi$. To define an atomic valuation, then, is just to pick
out one of these possible values, in accordance with the conditions on
admissible valuations. We can accomplish this as follows.

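Consider the set $\mathbb{R}^{\mathcal{Q}}$ of functions from the set of quantities $\mathcal{Q}$ to the real
numbers $\mathbb{R}$. For each state $\phi$ form the subset $\mathbb{R}_{\phi}$ of $\mathbb{R}^{\mathcal{Q}}$ that consists of
those functions f such that $f(Q) \in \Phi(Q)$ for each Q in $\mathcal{Q}$. Clearly $\mathbb{R}_{\phi}$ is not
empty, for none of the sets $\Phi(Q)$ is empty and each function f in $\mathbb{R}_{\phi}$ is
simply a choice function on this family of non-empty sets, as defined by the
Axiom of Choice. Finally, refine the set $\mathbb{R}_{\phi}$ further to a subset $\mathbb{R}^{+}_{\phi}$, where
$\mathbb{R}^{+}_{\phi} \subseteq \mathbb{R}_{\phi} \subseteq \mathbb{R}^{\mathcal{Q}}$, consisting of those functions f in $\mathbb{R}_{\phi}$ that satisfy the
following condition:

$$f(P_{M}) = 1 \text{ iff } \phi \in M, \text{ for every projection operator } P_{M}.$$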


Notice that if $\phi \notin M$, then $\Phi(P_{M})$ is either $\{0\}$ or $\{0, 1\}$, according to
whether or not $\phi$ is in the orthogonal complement $M^{\perp}$ of M. Whereas for
$\phi \in M$, one has $\Phi(P_{M}) = \{1\}$. If we define $X_{M}$ to be either the set $\{1\}$ or
the set $\{0\}$ depending, respectively, on whether $\phi \in M$ or $\phi \notin M$, then it
follows from these remarks that for each projection $P_{M}$ the set
$(\Phi(P_{M}) \cap X_{M})$ is not empty. The set $\mathbb{R}^{+}_{\phi}$, then, consists of choice functions
over the family of these non-empty sets together with the $\Phi(Q)$ for
non-projections Q. Thus the Axiom of Choice insures that $\mathbb{R}^{+}_{\phi}$ is itself not
empty. (I have put this in such tedious detail in order to allay any lingering
doubts about the existence of the valuations about to be defined.) We can
now use this set of real-valued functions on the quantities to introduce
atomic valuations.

For each state $\phi$, let $f_{\phi}$ be a function in $\mathbb{R}^{+}_{\phi}$. Define a valuation $v^{+\phi}$ as
follows:

$$v^{+\phi}_{Q}[A] = 1 \text{ iff } f_{\phi}(Q) \in A.$$

I shall call $f_{\phi}(Q)$ ‘the value of Q in $\phi$’, where I mean here point-value and
not set-value. Thus in $\phi$ the proposition that Q is confined to A is true
just in case the value of Q in $\phi$ is an element of A.
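
The construction can be sketched in a finite setting, where a choice function is just a table of picked values. This is a minimal illustration under stated assumptions, not the author's own formulation: the function names, the dictionary encoding of quantities, the `pick` parameter and the sample numbers are all hypothetical. Each non-projection receives one element of its value set, and each projection $P_{M}$ receives 1 exactly when the state lies in M, which is the refining condition above.

```python
# Illustrative sketch (hypothetical names): an atomic valuation obtained from
# a choice function f with f(Q) in Phi(Q), refined so that f(P_M) = 1 iff the
# state lies in M. Only finitely many quantities are modelled here, so no
# appeal to the Axiom of Choice is needed.

TOL = 1e-12  # numerical threshold for "probability zero" (an assumption)


def value_set(probs):
    """Phi(Q): eigenvalues of Q with non-zero probability in the state."""
    return {lam for lam, p in probs.items() if p > TOL}


def choose_values(quantities, projections, pick=min):
    """Return a dict f mapping each quantity name to one point value.

    quantities:  name -> {eigenvalue: probability} for non-projections
    projections: name -> probability of the value 1, i.e. (P_M phi, phi)
    pick:        how to select an element of Phi(Q); any selection would do
    """
    f = {}
    for name, probs in quantities.items():
        f[name] = pick(value_set(probs))
    for name, p_one in projections.items():
        in_M = abs(p_one - 1.0) <= TOL  # phi is in M iff the value 1 is certain
        f[name] = 1 if in_M else 0      # the refining condition
    return f


def atomic_valuation(f, name, A):
    """v+[A] = 1 iff the point value f(Q) is an element of A."""
    return 1 if f[name] in A else 0


# Sample state: probabilities 0.36 and 0.64 for the eigenvalues 1 and 2 of Q,
# with one projection skew to the state, one containing it, one orthogonal.
quantities = {"Q": {1: 0.36, 2: 0.64}}
projections = {"P_skew": 0.36, "P_sure": 1.0, "P_orth": 0.0}

f_low = choose_values(quantities, projections, pick=min)
f_high = choose_values(quantities, projections, pick=max)
print(f_low)
print(f_high)
```

The two tables differ only in the point value assigned to Q, so they give two distinct valuations for the same state; in both, the skew projection receives the value 0 because the state is not in its subspace.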

The family of valuations $v^{+\phi}$ so defined is admissible and atomic.
Condition (a) follows as before by extending the truth-values recursively to all
of $\mathcal{L}$. To see that (b) holds, suppose that $P^{\phi}_{Q}[A] = 1$. Then it follows
from the definition of $\Phi(Q)$ that $\Phi(Q) \subseteq A$. But $\mathbb{R}^{+}_{\phi}$ is defined to insure
that for any of its functions $f_{\phi}$ one has $f_{\phi}(Q) \in \Phi(Q)$. Thus if $P^{\phi}_{Q}[A] = 1$,
then $f_{\phi}(Q) \in A$; i.e. then $v^{+\phi}_{Q}[A] = 1$. Conditions (c) and (d) are obvious.
The regularity condition (e) follows from the refining condition for $\mathbb{R}^{+}_{\phi}$.
For $v^{+\phi}_{P_{M}}[\{1\}] = 1$ iff $f_{\phi}(P_{M}) = 1$, iff $\phi \in M$ by this refining condition.
Finally it is clear that these valuations are atomic since for every state $\phi$
and quantity Q one has

$$v^{+\phi}_{Q}[\{f_{\phi}(Q)\}] = 1,$$

in satisfaction of condition (f).


What is the atomic world like according to these valuations? Well, every
quantal state $\phi$ gives rise to a whole range of possible states of the world,
one for each $v^{+\phi}$ definable from $\mathbb{R}^{+}_{\phi}$. According to each of these possibilities
every quantum mechanical quantity has a precise point-value, and each
compound proposition attributing various values to various quantities
is, as the situation goes, either true or false. Thus, in particular, in each
possible state of the world corresponding to any quantum state $\phi$, the
quantities corresponding to (approximate) position and momentum will
have sharp values (and we can make the approximation as fine as we like
for both of these simultaneously). In the case of the two-slit experiment,
corresponding to the initial state $\phi$ of the electron there will be valuations
$v^{+\phi}$ according to which the electron does indeed go through just one of the
holes and then arrives at the designated location on the detecting screen.
Thus one can consistently cover the two-slit experiment by means of the
story of exclusive passage. This merely amounts to saying that when the
experiment is successfully run the right state of the world goes along with
the initial state of the electron.


9 PROBABILITY AND VALUATIONS 


Recall that the generalisation of the harmonious picture that goes with the
two-slit experiment required that two consistency claims be substantiated.
(See section 5). The first claim was that the classical propositional language
admits of bivalent valuations consistent with QM. In this regard I have
shown how one can reasonably by-pass the results of Kochen and Specker
as well as the implications of Gleason’s theorem and I have actually
defined two classes of admissible valuations, one atomic and one not.
Thus the first claim is substantiated. The second claim was that $\mathcal{L}$
together with the valuations admits of just those probability assignments
that QM actually makes. Implicit in this claim is that the valuations
should mesh consistently with the probabilities of QM in such a way as to
exclude compound probabilities just as QM does. It is this second claim,
including the implicit exclusion principle, that I want to discuss here in
connection with the valuations introduced in the preceding section.

The non-atomic valuations of section 8 are easy to deal with. Recall that

$$v^{\phi}_{Q}[A] = 1 \text{ iff } P^{\phi}_{Q}[A] = 1.$$

Hence there is just one possible state of the world associated with each
quantum state $\phi$, that state in which all the propositions with quantum
probability 1 are true and all the others are false. That, consistent with
this, probabilities other than 0 and 1 can be assigned is assured by the fact
that QM does assign them. As for the exclusion of compound probabilities
it comes about like this. Suppose $Q_1$, $Q_2$ are conjugate (i.e. non-commuting)
quantities with ranges $\Phi(Q_1)$, $\Phi(Q_2)$ in state $\phi$. If $\phi$ is an eigenstate of $Q_1$
then $\Phi(Q_1) = \{\lambda\}$, where $\lambda$ is the corresponding eigenvalue and
$\Phi(Q_2) = \{\lambda_1, \lambda_2, \ldots\}$, where every eigenvalue $\lambda_i$ of $Q_2$ occurs. Then one
has that $v^{\phi}_{Q_1}[\{\lambda\}] = 1$ and that $v^{\phi}_{Q_2}[\{\lambda_i\}] = 0$, for $i = 1, 2, \ldots$. Hence,

$$v^{\phi}(Q_1[\{\lambda\}] \text{ and } Q_2[\{\lambda_i\}]) = 0$$

for each i. But the compound proposition represented here is not only
false for this choice of $\phi$, it is false for every state $\phi$. Hence the only
plausible candidate for a joint probability here would have to assign
probability 0 to such a simultaneous attribution of sharp conjugate values.
But one cannot do this consistently with the requirements on marginal
probability, for they yield that

$$P^{\phi}_{Q_1}[\{\lambda\}] = \sum_{i} \mathrm{Prob}(Q_1[\{\lambda\}] \text{ and } Q_2[\{\lambda_i\}]).$$

If $\phi$ is the given eigenstate of $Q_1$ the left side here is 1 whereas each term
(and hence the sum) of the right side would be 0. Thus compound
probabilities are not well-defined.
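
A concrete instance may help. The following worked case is an illustration and not part of the original text; the choice of a spin-1/2 system with $Q_1 = \sigma_z$ and $Q_2 = \sigma_x$ is an assumption made for the example. Take $\phi$ to be the eigenstate of $\sigma_z$ with eigenvalue $+1$, so that each eigenvalue of $\sigma_x$ has probability 1/2 in $\phi$.

```latex
\begin{align*}
\Phi(\sigma_z) &= \{+1\}, &
\Phi(\sigma_x) &= \{+1, -1\}, \\
v^{\phi}_{\sigma_z}[\{+1\}] &= 1, &
v^{\phi}_{\sigma_x}[\{+1\}] &= v^{\phi}_{\sigma_x}[\{-1\}] = 0, \\
P^{\phi}_{\sigma_z}[\{+1\}] &= 1, &
\sum_{i = \pm 1} \mathrm{Prob}\bigl(\sigma_z[\{+1\}] \text{ and } \sigma_x[\{i\}]\bigr) &= 0 + 0 = 0 .
\end{align*}
```

The last entry uses the only plausible candidate assignment, probability 0 for each conjunction, and the marginal requirement would then demand that 1 equal 0.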

The situation with the atomic valuations $v^{+\phi}$ is more complex, for
associated with each state $\phi$ are very many admissible valuations $v^{+\phi}$. If one has
in mind the intuitive picture of a valuation as specifying a possible state of
the world, then one might try to connect the valuations with probability by
requiring that the probability (in $\phi$) for an event (= proposition) be
measured by the proportion of those states of the world in which the
event occurs (i.e. of those valuations $v^{+\phi}$ in which the proposition is true).
Since in general there will be too many states to measure by simple
proportions, this idea amounts to forming a classical probability space out of
the valuations themselves and then transferring the measure on this space
to the quantum mechanical probability for the event in question. Thus
one is led to the following construction.

Let $V^{+\phi}$ be the set of all admissible valuations associated with a quantum
state $\phi$. Clearly we want to assign a measure to each subset of $V^{+\phi}$ of the
form

$$\{v^{+\phi} \mid v^{+\phi}_{Q}[A] = 1\}$$

for some quantity Q and Borel set A. So, let $\mathcal{F}^{+\phi}$ be the smallest $\sigma$-algebra
of subsets of $V^{+\phi}$ that contains all of these sets. Let $P^{+}$ be a probability
measure on $\mathcal{F}^{+\phi}$, so that the triple $\langle V^{+\phi}, \mathcal{F}^{+\phi}, P^{+} \rangle$ forms a classical
probability space. The consistency requirement for probability, now, amounts
to requiring that

$$P^{\phi}_{Q}[A] = P^{+}\{v^{+\phi} \mid v^{+\phi}_{Q}[A] = 1\} \qquad \text{(cons)}$$

for all states $\phi$, quantities Q and Borel sets A. I shall discuss this
requirement for the atomic valuations already introduced.

First notice that if the programme outlined above could be carried out
we would be able to define a joint distribution $P^{\phi}_{Q_1, Q_2}$ for any pair of
quantities $Q_1$, $Q_2$ by setting

$$P^{\phi}_{Q_1, Q_2}[A \times B] = P^{+}[\{v^{+\phi} \mid v^{+\phi}_{Q_1}[A] = v^{+\phi}_{Q_2}[B] = 1\}].$$

One can readily verify that the distribution so defined satisfies the
conditions on marginal probability and is, therefore, a proper joint distribution.
This should come as no surprise since the programme outlined above
amounts to introducing, for each state $\phi$, an underlying phase space $V^{+\phi}$
and treating the quantities of QM as random variables on this space. In so
far as this programme embodies the consistency claim for probability, it is
something of a mixed blessing to see that it cannot be carried out.

To see this consider an operator Q that projects on some one-dimensional
subspace spanned by $\phi'$. Then if $\phi$ is a state function skew to $\phi'$; i.e. if
$0 < |(\phi, \phi')|^{2} < 1$, the regularity condition (e) implies that

$$\{v^{+\phi} \mid v^{+\phi}_{Q}[\{1\}] = 1\}$$

is empty and hence that

$$P^{+}[\{v^{+\phi} \mid v^{+\phi}_{Q}[\{1\}] = 1\}] = 0.$$

But from QM we have that

$$P^{\phi}_{Q}[\{1\}] = |(\phi, \phi')|^{2}$$

and we have chosen $\phi$ to make this number different from zero. Thus
quite generally the regularity conditions on the questions rule out the
possibility that the probabilities of QM can be obtained from the frequency or
from any other probability measures on the valuations according to (cons).

The requirements on admissible valuations do not, however, rule out
that (cons) be satisfied for all the non-questions; that is, for any Q which is
not a projection. If we could do this, then we could get the probability for
questions from the subspace structure of the Hilbert space and then fit
these two pieces together so as to yield all the probabilities of QM.

Assemble these pieces as follows. If Q is the operator $P_{M}$, projecting on
the subspace M, then let

$$P^{\phi}_{Q}[A] = \chi_{A}(1)\,(P_{M}\phi, \phi) + \chi_{A}(0)\,(P_{M^{\perp}}\phi, \phi),$$

where $\chi_{A}$ is the characteristic function of A, as one usually does in QM. If Q is
not a question, then define

$$P^{+}\{v^{+\phi} \mid v^{+\phi}_{Q}[A] = 1\} = P^{\phi}_{Q}[A].$$

Since all the probabilities so far assigned are the usual ones of QM the
rule connecting the probability of f(Q) with the probability of Q will
connect the probabilities for the questions consistently to the probabilities
for the non-questions. The only issue that remains is whether the function
$P^{+}$ defined above extends to all of $\mathcal{F}^{+\phi}$ as a genuine probability measure.
That it does so extend follows from the fact that

$$P^{+}[V] = \prod_{Q} P^{+}[\{v^{+\phi} \in V \mid v^{+\phi}_{Q}[A] = 1 \text{ for some Borel set } A\}],$$

which is just the usual product measure, is an extension of $P^{+}$ to a measure
$P^{+}$ defined on all V of $\mathcal{F}^{+\phi}$.

Thus one treats the questions in the usual way of QM, a way that is
clearly consistent with the regularity requirement (e) on the valuations.
Once it is clear that the old eigenvalue-eigenstate link is to be broken, one
can introduce an underlying phase space $V^{+\phi}$ for the non-questions and
require consistency for them by means of (cons). This can be satisfied if
there is a measure on $\mathcal{F}^{+\phi}$ that agrees with $P^{+}$, as defined above, on the
base sets of this $\sigma$-algebra. Thus $P^{\phi}$ embodies a set of marginal
probability conditions and the product measure $P^{+}$ is just one of infinitely
many measures on $\mathcal{F}^{+\phi}$ that yield these marginals. These two pieces, the
probabilities for the questions and the probabilities for the non-questions,
fit together coherently in accord with the rule that

$$P^{\phi}_{f(Q)}[A] = P^{\phi}_{Q}[f^{-1}(A)].$$

In this way one guarantees that the probabilities intuitively derived from
the valuations are consistent with the probabilities of QM.
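
The remark that the product measure is only one of many measures with the prescribed marginals can be checked in a small finite case. The sketch below is an illustration under stated assumptions and not the author's construction: it treats just two two-valued non-questions, takes a spin-1/2 state with amplitudes (3/5, 4/5) as the sample state, and uses hypothetical helper names. A state of the world is represented by the pair of point values it assigns to the two quantities.

```python
from fractions import Fraction
from itertools import product

# Born probabilities for two non-commuting two-valued quantities in one state.
# Assumed example: amplitudes (3/5, 4/5) in the eigenbasis of Q1, with Q2
# diagonal in the basis (1, +1)/sqrt(2), (1, -1)/sqrt(2).
a, b = Fraction(3, 5), Fraction(4, 5)
p_q1 = {+1: a * a, -1: b * b}
p_q2 = {+1: (a + b) ** 2 / 2, -1: (a - b) ** 2 / 2}


def product_measure(p1, p2):
    """Weight of each state of the world (x, y) = (value of Q1, value of Q2)."""
    return {(x, y): p1[x] * p2[y] for x, y in product(p1, p2)}


def marginals(joint):
    """Recover the two single-quantity distributions from a measure on pairs."""
    m1, m2 = {}, {}
    for (x, y), w in joint.items():
        m1[x] = m1.get(x, 0) + w
        m2[y] = m2.get(y, 0) + w
    return m1, m2


def shifted(joint, eps):
    """A different measure with the same marginals: move weight eps onto the
    pairs with equal values and take it from the pairs with unequal values."""
    return {(x, y): w + (eps if x == y else -eps) for (x, y), w in joint.items()}


P_plus = product_measure(p_q1, p_q2)
P_alt = shifted(P_plus, Fraction(1, 250))

print(marginals(P_plus) == (p_q1, p_q2))
print(marginals(P_alt) == (p_q1, p_q2))
print(P_plus[(1, 1)], P_alt[(1, 1)])
```

Both measures satisfy (cons) for the two quantities taken singly, yet they disagree on the weight of the doubly specified states, so the single-quantity probabilities do not by themselves fix a measure on pairs.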

It now appears, however, that the underlying phase space for the
non-questions will determine joint probabilities where they should not be
determined and where, according to the literature, they do not exist. So let
us inquire about the existence of joint distributions.


10 JOINT DISTRIBUTIONS


The claim I want to defend in this section is the following. The phase
space probabilities $P^{+}$, which insure the compatibility of the probabilities
of QM with the valuations, cannot be interpreted as yielding joint
distributions (for the non-questions) in the sense required by QM. One could
substantiate this claim by showing that the results already in the literature
concerning the existence of joint distributions only for commuting
quantities apply to the case at hand. The reason they apply is that they
involve a requirement which is weaker than the requirements involved in
using Gleason’s theorem or the Kochen and Specker no-embedding
theorem. Thus one can by-pass these latter results but yet still fall prey to
the no-joint-distribution proofs. I prefer, however, not to try to disentangle
the literature here, but rather to substantiate my claim by approaching the
question of joint distributions directly, within the framework already
developed.

Recall from the preceding section that if $P^{+}$ is the probability measure
on the phase space then we seem to be able to define a joint distribution
$P^{\phi}_{Q_1, Q_2}$ for any pair $Q_1$, $Q_2$ of non-questions as follows:

$$P^{\phi}_{Q_1, Q_2}[A \times B] = P^{+}[\{v^{+\phi} \mid v^{+\phi}_{Q_1}[A] = v^{+\phi}_{Q_2}[B] = 1\}].$$

One can readily see that the function $P^{\phi}_{Q_1, Q_2}$ extends uniquely to a
probability measure on the Borel subsets of $(\mathbb{R} \times \mathbb{R})$ and that it satisfies the
marginal probability requirements. To qualify as a joint distribution,
however, this measure must satisfy a further constraint imposed by QM.
The constraint is this.

Let $f: \mathbb{R} \times \mathbb{R} \to \mathbb{R}$ be a Borel function. In order that $P^{\phi}_{Q_1, Q_2}$ be
interpretable as a joint distribution in QM we require that there exist a quantity
$f(Q_1, Q_2)$ such that for all states $\phi$,

$$P^{\phi}_{f(Q_1, Q_2)}[A] = P^{\phi}_{Q_1, Q_2}[f^{-1}(A)]. \qquad \text{(jd)}$$

Let us note some of the implications of (jd). First, it yields the usual
connections between the probability for a quantity and the probability for
a function of that quantity. For if $f(x, y) = g(x)$ then $f(Q_1, Q_2) = g(Q_1)$
and (jd) yields

$$\begin{aligned}
P^{\phi}_{g(Q_1)}[A] &= P^{\phi}_{Q_1, Q_2}[f^{-1}(A)] \\
&= P^{\phi}_{Q_1, Q_2}[g^{-1}(A) \times \mathbb{R}] \\
&= P^{\phi}_{Q_1}[g^{-1}(A)],
\end{aligned}$$

by the marginal probability requirements. Second, (jd) insures that for any
state $\phi$ the possible values of $f(Q_1, Q_2)$ are among those numbers obtained
by applying f to the pairs of possible values of $Q_1$ and $Q_2$. That is, for
every state $\phi$,

$$\Phi[f(Q_1, Q_2)] \subseteq f[\Phi(Q_1) \times \Phi(Q_2)].$$

This follows from the fact that

$$P^{\phi}_{Q_1, Q_2}[\Phi(Q_1) \times \Phi(Q_2)] = 1.$$

For by (jd)

$$P^{\phi}_{f(Q_1, Q_2)}[f[\Phi(Q_1) \times \Phi(Q_2)]] = P^{\phi}_{Q_1, Q_2}[\Phi(Q_1) \times \Phi(Q_2)] = 1.$$


(Of course this relation between the possible values of $f(Q_1, Q_2)$ and those
of $Q_1$ and of $Q_2$ is simply the analogue for two-place functions of the
legitimate residue for one-place functions of the Kochen and Specker
requirement (**) of Section 7.) Third, notice that if all the quantities were
simply random variables over a common space, then (jd) would hold
automatically. For then $f(Q_1, Q_2)$ would be a random variable and (jd)
would follow from the definition of the distribution of that random
variable. It is because the quantities of QM are independently defined as
operators on a Hilbert space that (jd) must be formulated as a separate
requirement. Finally, I want to emphasise that (jd) is a requirement
relating the probabilities for the values of certain quantities. Unlike the
eigenvalue-eigenstate requirement, it does not directly link the values of
the quantities themselves. (It is precisely (jd) that is assumed in the
no-joint-distribution proofs. And indeed we can see that (jd) is distinct from
(**) of section 7, which is the requirement assumed in the
no-hidden-variable proofs.)

The rationale for imposing (jd) as necessary for the existence of a
joint distribution derives in a straightforward way from reflecting on the
intended interpretation of a joint-distribution function. For we want
$P^{\phi}_{Q_1, Q_2}[A \times B]$ to be interpreted as the probability that in state $\phi$,
quantity $Q_1$ has a value in A and quantity $Q_2$ has a value in B. If $P^{\phi}_{Q_1, Q_2}$
can bear this interpretation, however, then one must interpret
$P^{\phi}_{Q_1, Q_2}[f^{-1}(A)]$ as the probability in $\phi$ that $Q_1$ has some value $x_1$ and $Q_2$
has some value $x_2$ where $f(x_1, x_2) \in A$. But if this is well-defined for each
state $\phi$, then we can introduce a quantity $f(Q_1, Q_2)$ by requiring that its
probability (in $\phi$) for having a value in A is precisely the probability just
given that $f(x_1, x_2) \in A$. Thus if $P^{\phi}_{Q_1, Q_2}$ can be interpreted as a joint
distribution one can introduce a quantity $f(Q_1, Q_2)$ satisfying (jd). If,
however, $P^{\phi}_{Q_1, Q_2}$ is to be understood as giving the distribution of quantum
mechanical quantities, the quantity $f(Q_1, Q_2)$ introduced by (jd) must be a
quantity of QM. Thus (jd) emerges as a necessary condition for the
existence of a quantum mechanical joint distribution. Since the quantities of
QM are given by the self-adjoint operators on a Hilbert space, it is a
genuine question whether (jd) holds.

The question is this. For any Borel function $f: \mathbb{R} \times \mathbb{R} \to \mathbb{R}$ and for any
pair $Q_1$, $Q_2$ of non-questions, does there exist a self-adjoint operator
$f(Q_1, Q_2)$ satisfying (jd)? The answer is no: what is necessary and sufficient
for there being such an operator is that $Q_1$ commutes with $Q_2$. The
sufficiency here follows from the fact that if $Q_1$ commutes with $Q_2$ then
there are Borel functions $g_1$, $g_2$ and some operator Q such that $Q_1 = g_1(Q)$
and $Q_2 = g_2(Q)$. But then we can define $f(Q_1, Q_2) = f(g_1(Q), g_2(Q))$.
This is a well-defined self-adjoint operator and one can easily check that
(jd) holds by virtue of the usual rule connecting the probabilities of a
quantity with those of a (one-place) function of that quantity.
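
The sufficiency half can be checked directly in a small discrete case. What follows is a sketch under stated assumptions rather than a general proof: the common operator Q is taken to have the four eigenvalues 0 to 3, the Born probabilities, the functions `g1`, `g2` and `f`, and the helper names are all hypothetical choices for the example. Every distribution is obtained from that of Q by the one-place rule, and (jd) is then tested set by set.

```python
from fractions import Fraction

# Assumed example: Q has eigenvalues 0, 1, 2, 3 in some orthonormal basis, and
# the state assigns these Born probabilities to them.
p_q = {0: Fraction(1, 10), 1: Fraction(2, 10), 2: Fraction(3, 10), 3: Fraction(4, 10)}


def g1(q):
    """Q1 = g1(Q)."""
    return q % 2


def g2(q):
    """Q2 = g2(Q)."""
    return q // 2


def f(x, y):
    """A two-place function; deliberately not one-one."""
    return x + y


def distribution(h):
    """Distribution of h(Q) via the rule P_{h(Q)}[A] = P_Q[h^{-1}(A)]."""
    dist = {}
    for q, p in p_q.items():
        dist[h(q)] = dist.get(h(q), 0) + p
    return dist


# Joint distribution of the commuting pair, read off from the common operator.
joint = distribution(lambda q: (g1(q), g2(q)))
# Distribution of the operator f(Q1, Q2) := f(g1(Q), g2(Q)).
p_f = distribution(lambda q: f(g1(q), g2(q)))


def jd_holds(A):
    """Check (jd): P_{f(Q1,Q2)}[A] equals the joint probability of f^{-1}(A)."""
    lhs = sum(p for z, p in p_f.items() if z in A)
    rhs = sum(p for (x, y), p in joint.items() if f(x, y) in A)
    return lhs == rhs


print(p_f)
print(all(jd_holds(A) for A in ({0}, {1}, {2}, {0, 2}, {0, 1, 2})))
```

Here (jd) holds because both sides are computed from the single distribution of Q; nothing comparable is available when $Q_1$ and $Q_2$ are not functions of a common operator.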

To show necessity suppose $Q_1$ does not commute with $Q_2$; and suppose
there were a self-adjoint operator $Q = f(Q_1, Q_2)$ satisfying (jd). For
simplicity I shall assume that f is one-one; i.e., that if $\langle x, y \rangle \neq \langle x', y' \rangle$
then $f(x, y) \neq f(x', y')$. Since $\Phi[Q] \subseteq f[\Phi(Q_1) \times \Phi(Q_2)]$ for all states
$\phi$, the spectrum of Q is discrete. Hence the Q-eigenstates span the entire
Hilbert space. If $Q_1$ were to commute with $Q_2$ on each eigenstate of Q
then $Q_1$ would commute with $Q_2$ everywhere, contrary to the initial
supposition. It follows that there is some eigenstate $\phi$ of Q on which $Q_1$
does not commute with $Q_2$. Let $\lambda$ be the Q-eigenvalue corresponding to $\phi$.
By the assumption on f there are unique eigenvalues $\lambda_1$, $\lambda_2$ of $Q_1$, $Q_2$
respectively such that $f^{-1}[\{\lambda\}] = \{\langle \lambda_1, \lambda_2 \rangle\}$. By (jd) this implies that

$$P^{\phi}_{Q}[\{\lambda\}] = P^{\phi}_{Q_1, Q_2}[\{\langle \lambda_1, \lambda_2 \rangle\}] = 1.$$

By the requirements on marginal probability one has that

$$P^{\phi}_{Q_1}[\{\lambda_1\}] = P^{\phi}_{Q_1, Q_2}[\{\langle \lambda_1, \lambda_2 \rangle\}] + P^{\phi}_{Q_1, Q_2}[\{\lambda_1\} \times (\mathbb{R} - \{\lambda_2\})].$$

But since $P^{\phi}_{Q_1, Q_2}[\{\langle \lambda_1, \lambda_2 \rangle\}] = 1$, it follows that

$$P^{\phi}_{Q_1}[\{\lambda_1\}] = 1.$$

Similarly from

$$P^{\phi}_{Q_2}[\{\lambda_2\}] = P^{\phi}_{Q_1, Q_2}[\{\langle \lambda_1, \lambda_2 \rangle\}] + P^{\phi}_{Q_1, Q_2}[(\mathbb{R} - \{\lambda_1\}) \times \{\lambda_2\}]$$

it follows that

$$P^{\phi}_{Q_2}[\{\lambda_2\}] = 1.$$

These conditions, however, imply that $\phi$ is simultaneously an eigenstate of
$Q_1$ and of $Q_2$ and hence that $Q_1$ commutes with $Q_2$ on $\phi$, contrary to the
way in which $\phi$ was chosen. Thus the assumption that there is an operator
$Q = f(Q_1, Q_2)$ satisfying (jd) leads to a contradiction if $Q_1$ does not
commute with $Q_2$.

I have argued that (jd) poses a natural requirement for the interpretation
of $P^{\phi}_{Q_1, Q_2}$ as a distribution of quantum mechanical quantities. And I have
shown that this requirement is satisfied precisely where ordinary QM
allows the introduction of joint probabilities; that is, just in case of
quantities that commute. Thus the possibility of introducing phase space
measures $P^{+}$ for the non-questions does not allow for joint probabilities in
QM beyond those already in the purview of the theory.

Before concluding this section I should like to relate this somewhat
abstract and technical argument to the more homely reflections on
probability involved in the discussion of the two-slit experiment. The problem
posed by that experiment is to reconcile the existence of a probability for
each of the events ‘the electron goes through hole A’ and ‘the electron
arrives at location X on the detecting screen’ with the non-existence
(according to QM) of a probability for the compound event ‘the electron
goes through A and arrives at X’. If $\phi$ is the initial state of an electron in
the experiment, then we may suppose that there are many states of the
world compatible with $\phi$ (i.e. many valuations $v^{+\phi}$). In some of these, an
electron goes through A and arrives at X. In others this does not happen.
If we think of these states of the world as corresponding to tests from the
initial state $\phi$, then this state $\phi$ poses as a situation of chance for the
compound event to occur. Thus the non-existence of compound probabilities
here cannot be traced to the failure of the test-display presuppositions.
Moreover, since the probabilities for each of the ‘A’ and ‘X’ events can be
thought of as measuring the extent of those states of the world in which
that event occurs, one might suppose that a probability for the compound
event could be construed similarly.

Thus one might suppose that we could tag each state of the world in
which an electron goes through hole A and tag each state of the world in
which an electron arrives at X. Then one could separate out those states of
the world doubly tagged and measure their extent. This line of thought
amounts to introducing a quantity Q that takes only 0 and 1 as values, and
that takes the value 1 just in case an electron does go through hole A and
arrives at location X, and then identifying the probability for the compound
event with the probability that Q takes the value 1. That is, this line of
thought amounts to introducing a quantity Q in conformity with the
requirement posed by (jd). The discussion of that requirement, however,
shows that there is no such quantity in QM with the necessary probability
distribution. For the quantum mechanical quantities that correspond to
location around hole A and location around region X do not commute.
Hence, although the set of those states of the world in which the electron
goes through hole A and arrives at location X is itself well-defined, QM
dictates that its measure is not.

I would describe what happens in the two-slit experiment this way. The
initial state of the electron does establish a situation of chance for the
compound event. But, if QM is correct, it is not a situation of chance that
admits a probability for that event. It is, rather, like the situation that
obtains on the limiting relative frequency account of probability. There one
may have a pair of zero/one sequences in each of which there is a limit to
the relative frequencies for 1 but where in the compound sequence (in
which a 1 occurs in just those places where a 1 occurs in both original
sequences) there is no such limit. So, too, in the case of the two-slit
experiment one can imagine that the occurrence of the compound event of
passing through hole A and arriving at X is too erratic to admit of a
probability assessment, although the component events themselves are well
behaved. The discussion in this section of the requirement imposed by
QM on joint distributions leads to just this sort of view for the
probabilistic assessment of compound events for the theory in general.
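
The relative-frequency analogy can be made concrete. The construction below is one possible example, supplied here as an illustration and not taken from the original text; the block structure and all function names are assumptions. The first sequence alternates 0 and 1. The second copies the first on blocks of positions $[2^k, 2^{k+1})$ with k even and flips it on blocks with k odd. Each sequence has relative frequency of 1 tending to 1/2, while the compound sequence gains 1s only in the even-numbered blocks.

```python
# Illustrative sketch: two 0/1 sequences whose relative frequencies of 1 both
# settle near 1/2, while the frequency of 1 in their conjunction keeps swinging.


def block(n):
    """Index k of the block [2**k, 2**(k+1)) that contains n (n >= 1)."""
    return n.bit_length() - 1


def a(n):
    """First sequence: 1 at odd positions, 0 at even positions."""
    return n % 2


def b(n):
    """Second sequence: copies a in even-numbered blocks, flips it in odd ones."""
    return a(n) if block(n) % 2 == 0 else 1 - a(n)


def both(n):
    """Compound sequence: 1 exactly where both a and b are 1."""
    return a(n) * b(n)


def relative_frequency(seq, N):
    """Fraction of 1s among the first N terms seq(1), ..., seq(N)."""
    return sum(seq(n) for n in range(1, N + 1)) / N


for K in range(10, 16):
    N = 2 ** (K + 1) - 1  # last position of block K
    print(N,
          round(relative_frequency(a, N), 4),
          round(relative_frequency(b, N), 4),
          round(relative_frequency(both, N), 4))
```

At the end of each even-numbered block the compound frequency is close to 1/3, and at the end of each odd-numbered block it is close to 1/6. Since the blocks double in length, the same swing recurs at every scale, so the compound frequencies have no limit although both component frequencies do.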


11 CONCLUSION


In the first four sections of this essay I have sketched an intuitive and 
realistic account of the two-slit experiment as a vehicle for discussing the 
way in which QM handles probability. I have argued that this is a
perfectly ordinary scientific use of probability, a use entirely in accord with
the classical theory of probability as well as its statistical applications. I
identified the main obstacle to accepting this conclusion as the view that 
the concept of probability forces out the probability of compounds when- 
ever it allows for the probability of their components. I suggested that this 
view derives from a failure to appreciate the physical presuppositions 
involved in probabilistic discourse and that it leads to an unduly narrow 
conception of how to apply probability. In section 3 I sketched out a 
presuppositional account and in section 4 I outlined a general framework 
for the application of probability. QM fits both of these accounts, as do 
other scientific uses of probability. Thus the view about the probability of 
compounds is seen to be ill-founded and poses no obstacle to accepting the 
conclusion about the ordinary nature of probability in QM. 

The following picture of the two-slit experiment emerges from this
discussion. Each electron emitted from the source goes through exactly
one of the two holes and then arrives at some location on the detecting
screen. I have shown that this account is compatible with the interference
pattern building up on the detecting screen. This compatibility involves
asserting that although all of the events involved in this picture of the
passage of the electrons through the apparatus are well-defined, not all of
them have well-defined probabilities. In particular, if QM is correct, then
there is no probability for the event which occurs if an electron does go
through one hole and then arrives on the screen. (Although the
probability for it to go through a particular hole as well as the probability for it
to reach a particular location on the screen are each well-defined.)

I have used the two-slit experiment in order to make the discussion
fairly concrete. Nevertheless, I wanted the view that emerged to generalise
to QM as a whole. It appears that the appropriate generalisation involves
thinking of QM as an essentially probabilistic theory of a determinate
domain. There is, however, a literature on hidden variables and joint
distributions which would seem to rule out precisely such a view of QM.
In sections 5 through 10 I indicated just what I take this generalisation to
be and I showed how it avoids the objections posed by this literature.

I have adopted a logical point of view and treated QM as specifying a
class of elementary propositions, each asserting that a quantity of the
theory is confined to some set of numbers. Closing these elementary
propositions under the ordinary logical functors leads to a classical
propositional language for the theory. The hidden variable literature now
poses the question as to whether this language admits of ordinary (bivalent)
truth conditions (valuations). In Section 6 I have shown how to formulate
this problem in precise terms and in Sections 7 and 8 I have shown that
there are indeed ordinary truth conditions available for the language of
QM. This demonstration involves by-passing the no-hidden-variable-proofs.
In this regard I argue that these proofs involve assumptions that
imply an unreasonable link between a quantity having a value and the
state of the system in which this occurs. They imply that a quantity has a
value if and only if the state of the system is the corresponding eigenstate.
But I show that it is precisely this link that must be abandoned if we are to
give the sort of picture for the two-slit experiment just sketched. Thus
the assumptions involved in the no-hidden-variable-proofs would by
themselves rule out the sort of intuitive and realistic picture of the domain
of QM that we hope to be able to maintain.

Once it is clear that we can drop these assumptions and thus introduce
classical truth-conditions for the language of QM, the further question of
whether these conditions are consistent with the probabilities of QM
arises.

In sections 9 and 10 I formulate this question and show that indeed the
variable truth-conditions connect with probability as expected: they allow
for precisely the probabilities given by QM and for nothing more. This
demonstration involves isolating the basic assumption of the
no-joint-distribution proofs (requirement (jd) of Section 10), showing that it is
indeed a reasonable necessary condition for a joint distribution and then
proving that it leads precisely to the commutativity rule of QM. Thus I
drive a wedge between the no-hidden-variable-proofs and the
no-joint-distribution-proofs, rejecting the former and accepting the latter. The
upshot is to develop a determinate domain for QM, a domain where every
quantity can have, in every state, a precise (point) value; but where the
probabilities are exactly those of QM. Of course, this is just the
generalisation contemplated for my description of the two-slit experiment.

One way of viewing the work here is to construe it as a detailed justifica- 
tion for the following view. QM treats a world populated by entities that 
have determinate spatio-temporal relations, determinate trajectories, and 
so on. If QM is true then some but not all of the events occurring in this 
world can be assigned probabilities. QM assigns probabilities wherever 
they can be assigned and relates these probabilities in the classical way. 
QM does not (except in the case of probability 1 or 0) go beyond
probabilistic assessment to the description of the events themselves.

This point of view is consistent with the formalism of QM and with QM 
as practiced. It enables one to give simple, intuitive accounts of experi- 
ments in the domain of quantum physics. Thus it enables one to deal 
effectively with the outstanding problems and paradoxes in the inter- 
pretation of the theory. This point of view places QM in the mainstream 
of scientific theories by displaying a continuity between the concepts of 
classical physics and their successors in quantum physics. By establishing 
a logical setting for QM that is entirely classical, it enables one to link QM 
logically with other scientific theories. Thus with regard to both underlying 
world-view and historical/conceptual setting I hope that the work 
here will help to dispel the aura of mystery that has long surrounded QM. 


Cornell University and 
University of Illinois at Chicago Circle 


Discussions 


COULD THERE EXIST A WORLD WHICH OBEYED 
NO SCIENTIFIC LAWS? 


It may seem that the answer to this question is obvious and that I am simply 
erecting a structure which I already know how to undermine, but this was not 
how it appeared, and I hope that I can convince others that this problem is not 
trivial. In discussing it I assume the existence of a language with at most a 
countable number of symbols, so that by ‘scientific law’ I mean ‘scientific law 
frameable in a finite length in the language’. I shall comment on this restriction 
later. Further, to ease the exposition, I shall assume that the outcome of a certain 
infinite series of experiments is expressible as a certain infinite series of 0’s and 1’s, 
where some coding may be necessary for this expression to be possible. 

By a deterministic law I mean an effective rule for predicting the next outcome 
with the knowledge only of the previous outcomes. It is obvious that there 
exist infinite sequences of 0’s and 1’s that do not obey deterministic laws, for to 
each law there corresponds one sequence, and there are an uncountable number of 
infinite sequences but only a countable number of laws, since there are only a 
countable number of expressions of finite length in a language with a countable 
number of symbols. 
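
The counting argument above is non-constructive. As an illustration only, the following Python sketch shows a constructive (diagonal) variant for a finite list of hypothetical laws. Each law is modelled as a function from the outcomes so far to a predicted next outcome, and at stage k the sequence emits the opposite of what the k-th law predicts. The three example laws and all function names are assumptions made for this sketch and do not appear in the text.

```python
from typing import Callable, List, Sequence

# A deterministic law: given the previous outcomes, predict the next one (0 or 1).
Law = Callable[[Sequence[int]], int]


def always_zero(history: Sequence[int]) -> int:
    """Hypothetical law: the next outcome is always 0."""
    return 0


def alternate(history: Sequence[int]) -> int:
    """Hypothetical law: outcomes alternate 0, 1, 0, 1, ..."""
    return len(history) % 2


def repeat_last(history: Sequence[int]) -> int:
    """Hypothetical law: the next outcome repeats the last one (0 at the start)."""
    return history[-1] if history else 0


EXAMPLE_LAWS: List[Law] = [always_zero, alternate, repeat_last]


def defeat_all(laws: List[Law]) -> List[int]:
    """Build a finite sequence whose k-th outcome contradicts the k-th law."""
    outcomes: List[int] = []
    for law in laws:
        predicted = law(outcomes)
        outcomes.append(1 - predicted)  # emit the opposite of the prediction
    return outcomes


def obeys(law: Law, outcomes: Sequence[int]) -> bool:
    """True if every outcome matches the law's prediction from its predecessors."""
    return all(law(outcomes[:k]) == outcomes[k] for k in range(len(outcomes)))


if __name__ == "__main__":
    sequence = defeat_all(EXAMPLE_LAWS)
    print(sequence, [obeys(law, sequence) for law in EXAMPLE_LAWS])
```

Applied to a countable enumeration of all deterministic laws rather than a finite list, the same step-by-step construction would yield an infinite sequence obeying none of them.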

However, the situation is vastly more complex when one comes to probabilistic laws, 
which may be defined as effective rules for determining the probability of the 
next outcome given the sequence of outcomes up to that point. The best known 
of these laws is that corresponding to the tossing of a fair coin, where the probability 
that the next toss will give heads (perhaps coded as 0 in the corresponding 
sequence) is ½ independently of what has occurred already. In the language of 
measure theory almost all sequences obey this law. Now the two concepts ‘obeying 
a probabilistic law’ and ‘almost all’ can be expressed mathematically, but are not 
simply explained in everyday language. The first is not essentially relevant to 
the argument, but I shall attempt to explain the second in order to demonstrate 
that the problem being considered is not trivial. 

I shall try to explain what it means to say that in almost all sequences of 0’s 
and 1’s the proportion of 0’s is ½. It can be shown (for instance, using the Chebychev 
inequality) that of the 2^n sequences of 0’s and 1’s of length n, the proportion of them in 
which the proportion of 0’s lies in the range ½(1 ± n^(−1/3)) is greater than 1 − n^(−1/3). 
In other words as n → ∞ the proportion of sequences in which the proportion of 
0’s differs from ½ is zero, although there are still an uncountable number of them. 
This is the intuitive meaning of ‘almost all sequences of 0’s and 1’s have as proportion 
of 0’s ½’. I can only state that for all definitions so far produced as to what 
it is to obey a probabilistic law, it has been proved that almost all sequences of 
0’s and 1’s obey the law ‘0’s and 1’s appear randomly, each with probability ½’. So, 
if almost all sequences obey just one probabilistic law and there are an infinite 
number of probabilistic laws, each obeyed by an uncountable number of 
sequences, it is not obvious that there are sequences that obey none. 
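
The finite-n claim can be checked numerically. The sketch below is an illustration under stated assumptions, not part of the original argument. It assumes the tolerance n^(−1/3) for both the range and the bound, a choice under which the Chebychev inequality gives exactly this bound: the proportion of 0’s has variance 1/(4n), so the fraction of sequences outside ½(1 ± n^(−1/3)) is at most (1/(4n)) / (¼ n^(−2/3)) = n^(−1/3). The function names are hypothetical.

```python
from math import comb


def fraction_near_half(n: int) -> float:
    """Exact fraction of the 2**n binary sequences of length n whose
    proportion of 0's lies within (1/2) * (1 +/- n**(-1/3))."""
    eps = 0.5 * n ** (-1.0 / 3.0)  # half-width of the allowed range
    # comb(n, k) counts the sequences of length n containing exactly k zeros.
    count = sum(comb(n, k) for k in range(n + 1) if abs(k / n - 0.5) <= eps)
    return count / 2 ** n


def chebyshev_lower_bound(n: int) -> float:
    """Lower bound 1 - n**(-1/3) obtained from the Chebychev inequality."""
    return 1.0 - n ** (-1.0 / 3.0)


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, fraction_near_half(n), chebyshev_lower_bound(n))
```

Both quantities tend to 1 as n grows, and the exact fraction stays above the bound, which is the finite counterpart of the ‘almost all’ statement.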
I am not going to define the term ‘scientific’ but to point out a feature which 
appears to be necessary if a law is to be so described. It is that it must be logically 
possible at any stage for experiments to yield a finite sequence of outcomes 
which are sufficient for the law to be rejected according to some principle. Now 
a deterministic law only has one rejection principle, that is if one of its predic- 
tions is wrong, but probabilistic laws have, in general, an infinite number. We 
may select a 5 per cent rejection rule, or a 1 per cent, or anything in between, 
and there are many other kinds. However, for any scientific law (not necessarily 
probabilistic), there can at most be a countable number of rejection principles, 
since there are only a countable number of expressions in the language. A se- 
quence of outcomes does not obey a law if for each corresponding rejection 
principle there is a stage at which it is rejected. 

Let us suppose that there is an ordering Lᵢ of the scientific laws, and that to 
each law Lᵢ there corresponds an ordering Rᵢⱼ of the associated rejection principles. 
We may imagine the laws Lᵢ positioned in order on a line with the 
corresponding rejection principles lying in columns below. A sequence exists 
with the following properties: it starts with a finite sequence of 0’s and 1’s sufficient 
for the first law L₁ to be rejected according to the principle R₁₁, then continues 
in such a way as to have law L₁ rejected by R₁₂, then L₂ by R₂₁. In fact 
we proceed through the lattice of rejection principles upwards and to the right 
where possible, in the order R₁₁, R₁₂, R₂₁, R₁₃, R₂₂, R₃₁, R₁₄ etc. Given any rejection 
principle Rᵢⱼ, at some stage in the sequence the law Lᵢ is rejected according 
to it and thus the sequence obeys none of the laws. Thus it is logically possible 
that a system could develop in such a way as to obey no scientific law. 
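
The traversal of the lattice can be made explicit. In the sketch below, a pair (i, j) stands for the j-th rejection principle of the i-th law; the pair notation, the function names and the closed-form stage number are assumptions introduced for illustration. The generator walks each anti-diagonal ‘upwards and to the right’, so every pair is reached after finitely many steps.

```python
from itertools import islice
from typing import Iterator, List, Tuple


def lattice_order() -> Iterator[Tuple[int, int]]:
    """Yield (law, principle) index pairs diagonal by diagonal:
    (1,1), (1,2), (2,1), (1,3), (2,2), (3,1), (1,4), ..."""
    total = 2  # law index + principle index is constant on each diagonal
    while True:
        for law in range(1, total):
            yield (law, total - law)
        total += 1


def first_principles(count: int) -> List[Tuple[int, int]]:
    """Return the first `count` pairs visited by the traversal."""
    return list(islice(lattice_order(), count))


def stage_of(law: int, principle: int) -> int:
    """1-based position at which the pair (law, principle) is visited."""
    total = law + principle
    # (total - 2)(total - 1)/2 pairs lie on earlier diagonals; `law` locates
    # the pair within its own diagonal.
    return (total - 2) * (total - 1) // 2 + law


if __name__ == "__main__":
    print(first_principles(7))
    print(stage_of(3, 1))
```

Because stage_of is finite for every pair, a sequence built by satisfying one rejection principle per stage eventually triggers each principle of each law, which is all the argument requires.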

Let us now consider the original restriction that the law was expressible in 
some language. It could be said that the law “0’s occur randomly with probability 
p” where p is non-computable is a perfectly good law. Now I do not know 
whether or not non-computable numbers are essentially unapproachable in the 
sense that it is impossible to exhibit them and thus name them, but I can point 
out one thing. If we are allowing any number to be named, then to any infinite 
sequence there corresponds a number, and therefore, presumably, the name of 
that number is a deterministic law which that sequence obeys. In other words, if 
we removed the restriction originally imposed, everything would be determin- 
istic, but in a barren sense. 

A. W. SUDBURY 
School of Mathematics, 
Bristol University 
Review Articles 


IS THE PROGRESS OF SCIENCE EVOLUTIONARY?¹ 


1 The Programme of Toulmin’s Trilogy. 

2 Toulmin’s Case for an Evolutionary Philosophy of Science. 

3 Some Radical Flaws in Toulmin’s Evolutionism. 

4 The Weakness of Toulmin’s Arguments against Logico-Analytical Philosophy of 
Science. 

5 Some Factual Errors in the Text. 


1 The Programme of Toulmin’s Trilogy. 


This is the first volume of a projected trilogy, in which Professor Toulmin seeks 
to achieve a definitive statement and justification of his philosophy of science. 
The overall motif is an insistence that the rationality of science is primarily to 
be understood, not in terms of logical relations between propositions, but in 
terms of discoverable patterns in the historical evolution of intellectual discip- 
lines. The first volume is the only one at present published. It contains a general 
introduction to the whole work, together with a statement of Toulmin’s views 
about the collective use and evolution of concepts. The second volume is 
promised for 1974, or thereabouts, and will give an account of the way in which 
individuals grasp and develop concepts. The third volume is planned for about 
1976 and will discuss the rational adequacy and appraisal of concepts. 

The monumental scale on which even the first volume is constructed, the 
variety of topics it traverses, the boldness of the claims it makes, its immense 
parade of learning, and the scorn which is heaped on rival philosophies, all combine 
to ensure that it will at least be much discussed. It may even, despite its size, be 
widely read, since it is written throughout in an admirably readable style and is 
full of provocative comment. But it suffers from three rather crucial faults. Its 
evolutionary model for the history of intellectual disciplines is not entitled to be 
put forward under an aura of Darwinian respectability. It seriously overstates the 
case against the elucidation of scientific rationality in terms of logical relations 
between propositions. And it contains too many factual errors or misrepresenta- 
tions for a book that appeals as often as it does to facts about the history and 
sociology of the human intellect. Perhaps later volumes will remedy some of 
these flaws. Meanwhile one can only pay the book the compliment of criticising 
it as forthrightly as it criticises others. 

In what follows I shall first summarise Toulmin’s main argument, and then 
discuss in turn each of the three main faults that vitiate his presentation of this 
argument. 


2 Toulmin’s Case for an Evolutionary Philosophy of Science. 


Toulmin’s own theory will emerge most clearly if we relate it, as he himself does, 
to the positions of Frege, Collingwood, and Kuhn, respectively. 

1 Review of Toulmin [1972]: Human Understanding, 1. Oxford: Clarendon Press, £4.75. 
Pp. xii+ 520. 
Toulmin discerns two polar extremes in the philosophy of human thought. At 
one extreme he locates the view that philosophers should concern themselves 
with human concepts only as timeless, intellectual ideals towards which the 
human mind struggles, at best, painfully and little by little (p. 56). Neither history 
nor psychology should, on this view, be allowed to adulterate philosophy, since 
all true knowledge must be founded ultimately on changeless, ahistorical proper- 
ties, relations or principles (p. 55). Frege’s Foundations of Arithmetic, according 
to Toulmin, was the path-finding exemplar for this modern Platonism. The 
programme Frege enunciated in the 1890s became a paradigm for Bertrand 
Russell’s earlier work and for half a century’s research on philosophy of science, 
especially in Vienna and the United States (p. 57). Within the Unity of Science 
movement the symbolism of mathematical logic became, as Euclid’s geometrical 
ideas had been earlier, the obligatory medium for expounding a coherent and 
unified scientific theory (p. 59). 

Since even the simplest human languages allow for some elementary logical 
and arithmetical operations, says Toulmin, one can see how Frege and his suc- 
cessors could regard Man’s historical conceptions of negation or number as 
gropings towards the ultimate formulation of ‘pure concepts’, and one can feel 
that they could, in these cases, safely ignore the complexities of historical and 
anthropological fact. But outside logic and pure mathematics, he holds, this 
Olympian stance is not so easily maintained, nor is history so readily escaped. 
‘In substantive and developing fields of enquiry, like dynamics and political 
theory, the philosopher’s central task is no longer to recognise how, “after 
immense intellectual effort”, humanity at last “stripped away irrelevant accre- 
tions” from the “pure concepts” in question, and so arrived at the perfect idealis- 
ations which alone have philosophical interest or authority. Rather, it is a step- 
by-step task, of recognising the considerations which justify replacing one set of 
theoretical conceptions by another within the historical sequence (weight and 
impetus by mass and momentum, polis by nation-state, rank by class) and of finding 
impartial procedures for comparing the merits of the concepts actually employed 
in different contexts’ (pp. 60-1). Indeed, in mathematics as in science, any 
attempt to judge conceptual novelties, or to make comparisons across the intel- 
lectual boundaries between rival theories, soon drives us, according to Toulmin, 
beyond the range of a purely formal analysis (p. 64). Nor will it do to appeal to 
vague, pragmatic considerations of simplicity or convenience. Especially at the 
outset, the Copernican theory was by many tests substantially less simple or 
convenient than the traditional Ptolemaic analysis (p. 65). 

At the opposite extreme to Frege’s absolutism stands the relativism that is 
typified for Toulmin by R. G. Collingwood’s Essay on Metaphysics. Since men’s 
intellectual standards have varied between different historical and cultural milieus 
in just the same way as their ethical and aesthetic preferences, the only safe posi- 
tion, in Collingwood’s view, is to concede final authority within any milieu to the 
particular intellectual standards current in it, while denying those standards any 
relevance or authority outside their original contexts. These absolute presuppositions 
of a particular milieu are subject only to historical analysis, and not to 
rational criticism of any kind (pp. 66-74). 

Thus, on Toulmin’s view, when it comes to explaining the rational consider- 
ations justifying men’s actual transitions from one set of basic concepts to its 
historical successor, neither Frege nor Collingwood can give us the critical tools 
we need. Frege dismissed all such historical questions as ‘merely empirical’, and 
concerned himself only with ‘pure concepts’ in their final, perfected forms. 
Collingwood for his part recognised the importance of the question, but left him- 
self no way of answering it. Moreover instead of challenging Frege’s anti-histor- 
ical assumption—that concepts and propositions can carry rational authority only 
if they form a ‘logical system’—Collingwood merely offers an alternative account 
of the logical relations by which the elements of conceptual systems are connected 
together (p. 81). 

Faced with the opposition between Frege’s absolutism and Collingwood’s 
relativism, Toulmin seeks to escape through the horns of the dilemma by aban- 
doning the assumption that the nature of scientific development is to be under- 
stood primarily in terms of a system or systems of logically related elements. Now 
Kuhn too has abandoned this assumption. So Toulmin leads into his own 
solution of the problem via a discussion of Kuhn’s (pp. 106 ff). He points out 
how in Kuhn’s earlier [1957] ‘the term “revolution” claimed no explanatory 
significance. It simply marked a profound redirection of men’s intellectual 
loyalties on the theoretical level, and cast doubt on an “‘uniformitarian” sug- 
gestion that intellectual progress in science always—and properly—depends on 
the application of a routine Scientific Method.’ Later Kuhn came to hold that 
any description of conceptual changes as ‘revolutionary’ phenomena entails the 
need to give a correspondingly ‘revolutionary’ explanation of their manner of 
occurrence. The intellectual loyalties of a school of scientists were then said 
to be organised around ‘paradigms’ and a paradigm-switch was said to involve a 
transfer of loyalty, or intellectual conversion, comparable to that involved 
in the abandonment or modification of a theological dogma. But even in 
the first edition of The Structure of Scientific Revolutions (1962), Kuhn 
claimed that, because revolutionary scientific changes were fairly rare, it 
was normal for the scientists of a particular school to work under the author- 
ity of an accepted paradigm. When critics began to cast doubt on the 
view that any scientific change had ever been as drastic and total as those 
which Kuhn called revolutions, his response (cf. the second edition, 1970) was to 
claim an unending sequence of smaller revolutions. But this response entails 
abandoning, according to Toulmin (p. 114), the central distinction around which 
Kuhn’s whole theory had been built—viz. the distinction between conceptual 
changes taking place within the limits of an overall paradigm and those involving 
the replacement of an entire paradigm. The original contrast between alter- 
nating phases in the development of scientific theory has been transformed into 
a distinction between (i) scientific arguments that involve no conceptual changes 
and can be presented in terms drawn from formal logic and (ii) scientific arguments 
that involve conceptual or theoretical novelties and cannot be so presented. 
A historical and explanatory contrast, says Toulmin, has been transformed into 
a purely analytical distinction. Kuhn’s latest account of ‘scientific revolutions’ 
is no longer a theory of conceptual change at all (p. 117). 


Toulmin points out how political scientists, too, have lost confidence in the 
explanatory power of the term ‘revolution’, since so many continuities of law, 
custom or institution have survived even the most drastic political upheavals 
(like the American, French or Russian ‘revolutions’). Instead he proposes to 
give an ‘evolutionary’ account of scientific development ‘not just in the broad 
sense of being non-revolutionary, but in a quite precise and strict sense of the 
term’ (p. 134). To do this Toulmin thinks it unnecessary to assume that intel- 
lectual evolution has something biological about it, or even that the process of 
conceptual change in the sciences displays any substantial resemblance to the 
process of organic change. Rather his hypothesis is ‘that Darwin’s populational 
theory of “variation and natural selection” is one illustration of a more general 
form of historical explanation, and that this same pattern is applicable also, on 
appropriate conditions, to historical entities and populations of other kinds’ 
(p. 135). In particular it is applicable to intellectual disciplines, since the content 
of a natural science should be considered as a population of concepts, methods 
and aims within which there are—at most—localised pockets of logical systema- 
ticity (p. 128). 

Toulmin thinks that the general pattern of historical explanation implicit in 
evolutionary zoology can be condensed into four basic theses (pp. 135 ff.), each 
of which has a counterpart in the case of conceptual evolution. First, we need 
explain not only why organic species change as they do but also why, within con- 
tinually varying populations of living things, any such definite and discrete 
species are found at all. Secondly, both these facts can be explained in terms of 
a single process of natural selection, whereby most variations of feature put the 
individual at a disadvantage in the competition for reproduction but occasional 
advantageous novelties become established. Thirdly, this process gives rise to 
authentic new species only where appropriate environmental conditions prevail. 
In particular there must be sufficient pressure within a limited forum of competi- 
tion in order to ensure that a novel variation can demonstrate its advantages. 
Finally, variants are selectively perpetuated if and only if they are sufficiently 
well-adapted, i.e. if they cope effectively enough with the ecological demands of 
the particular environment; and these demands are made both by the climate, 
soil, terrain, etc., and also by other co-existing populations of organisms. 

Toulmin then seeks to restate these four basic theses in terms applicable to 
conceptual development (pp. 139 ff.). First, within any particular culture or 
epoch men’s intellectual enterprises fall into more-or-less separate and well- 
defined disciplines, each characterised by its own body of concepts, methods and 
fundamental aims. ‘An evolutionary account of conceptual development accor- 
dingly has two separate features to explain: on the one hand, the coherence and 
continuity by which we identify disciplines as distinct and, on the other hand, 
the profound long-term changes by which they are transformed or superseded.’ 
Secondly, ‘in any live discipline, intellectual novelties are always entering the 
current pool of ideas and techniques up for discussion, but only a few of these 
novelties win an established place in the relevant discipline, and are trans- 
mitted to the next generation of workers’. So ‘this same process can account either 
for the continued stability of a well-defined discipline or for its rapid transformation 
into something new and different’. Thirdly, there must exist suitable 
forums of competition within which intellectual novelties can survive for long 
enough to show their merits or defects, but in which they are also criticised and 
weeded out with enough severity to maintain the coherence of the discipline. 
Finally, ‘the disciplinary selection process picks out for “accreditation” those of 
the “competing” novelties which best meet the specific “demands” of the local 
“intellectual environment”. These “demands” comprise both the immediate 
issues that each conceptual variant is designed to deal with and also the other 
entrenched concepts with which it must co-exist.’ 

Toulmin applies his evolutionary account to the elucidation of many different 
features in the development of science. He points out that a populational approach 
to intellectual history debars us from giving permanent definitions of the resul- 
ting disciplines, marking off different fields of enquiry by immovable boundaries 
in terms of supposedly unchanging ‘essential properties’, whether methods or 
problems, theories or concepts, techniques or subject-matter. In intellectual 
history as in natural history the old philosophical ideal of permanent entities, 
which preserve an essential identity through a continuing sequence of accidental 
historical changes, can be replaced by a more lifelike notion of historical entities 
which, though possessing no absolutely unchanging characteristics, preserve 
enough unity and continuity to remain distinct and recognisable from one epoch 
to another (p. 141). Also, he thinks (pp. 142-3), any well-structured rational 
enterprise can be regarded not only as an intellectual discipline, comprising a 
communal tradition of procedures and techniques for dealing with theoretical 
or practical problems, but also as a profession, comprising the organised set of 
institutions, roles and men whose business it is to apply or improve those pro- 
cedures and techniques. The evolutionary history of a rational enterprise needs 
to treat these two aspects as being in constant interaction with one another. For 
example, in examining the selection-procedures actually used in evaluating the 
intellectual merits of each new concept, we must relate them to the activities of 
the men who form, for the time being, the authoritative reference-group of the 
profession concerned—the editors, the invisible college, the writers of authorita- 
tive textbooks, etc. In this way it is possible to illuminate both the intellectual 
and the institutional differences that exist between a would-be discipline like 
psychology or sociology (pp. 378 ff.) and a genuine one like physics or chemistry. 
But while disciplinary accounts aim at the rational appraisal of conceptual 
changes, accounts in terms of professional organisation are aimed primarily at 
diagnosis and causal explanation (p. 310). 

Obviously we must wait until the publication of volume 3 for a definitive 
statement of the way in which Toulmin proposes to elucidate the concept of 
rationality in intellectual history. But the present volume provides a substantial 
indication of his ideas on this subject. Scientific theories, he says, are formal 
artefacts, or abstractions, taken from a historically developing enterprise whose 
rationality lies primarily in its procedures of conceptual change (p. 478). In his 
view the rationality of natural science and other collective disciplines has nothing 
intrinsically to do with formal entailments and contradictions, with inductive 
logic or with the probability calculus (p. 479). Nor need it make any appeal to 
a priori demarcation criteria like those of Popper and Lakatos (pp. 480-3). 
Fundamentally, he argues, this is because ‘the establishment of new or modified 
concepts involves—in the nature of the case—non-stereotyped procedures, which 
can be expressed linguistically only in non-formal arguments, framed in terms of 
“meta-statements” about those novel or modified concepts’. More specifically, 
the study of rationality in scientific disciplines is the search for a full and detailed 
understanding of the features that make alternative strategies and procedural 
innovations adaptive to particular ecologically recognisable problem-situations 
(p. 484). So some kinds of rational and impartial comparisons can be drawn between 
the concepts and beliefs of men in different milieus, just as, in jurisprudence, 
the existence of jurisdictional boundaries never makes it wholly impossible 
to consider judicial precedents from older cases or other countries (pp. 86-9, 
491-3). To the extent that men living in different milieus have faced similar col- 
lective problems, and developed comparable collective activities—or rational 
enterprises—for tackling them in an organised manner, we can recognise those 
parallel enterprises as defining corresponding forums of judgment. Accordingly, 
Toulmin argues, within the limits defined by the respective enterprises and 
milieus, we may in each case consider—retrospectively—what was in fact 
achieved by accepting the concepts under consideration towards meeting the 
relevant demands of, say, physics or law, or—prospectively—what light such com- 
parisons throw on possible ways in which the proper goals of scientific understan- 
ding or judicial administration could be better formulated for the future. In sum . 
(pp. 502-3) scientists ‘can ground the strategic estimates on which rational 
changes of policy are based, only on a well-digested appreciation of earlier 
achievements in those same enterprises.... The burden of “rationality”... 
consists in the fundamental obligation to continue reappraising our strategies in 
the light of fresh experience.’ 


3 Some Radical Flaws in Toulmin’s Evolutionism. 


It is easy to discern affinities between Toulmin’s philosophy and Hegel’s. Both 
insist that formal-logical analyses miss the essence of human thought. Both see 
social institutions as an outward manifestation of intellectual development. Both 
tend to imply that the rationality of conceptual change is measured by its 
acceptance. Correspondingly, it would be easy to reject Toulmin’s philosophy 
of science with some such facile criticism as that it commits ‘the genetic fallacy’, 
and confuses what ought to happen in the history of science with what actually 
does happen, since apparently it asserts that rationality in science is always to 
be determined by reference to the aims and methods of particular groups of 
scientists. But this would be altogether too short a way to deal with Toulmin. His 
historical approach might suggest useful hints as to the proper criteria of ration- 
ality in science, even if these criteria cannot be identified with the actual hazards 
that surviving disciplines have surmounted. Also, and rather more importantly, 
even if Toulmin’s theory could tell us little of interest about criteria of rationality, 
it might still be full of interest as a pattern of explanation for what actually 
happens. Even if it were quite inadequate as a philosophy of science tout court, 
it might still be of great interest as a philosophy of the history of science. So let 
us first see what there is to be said for Toulmin’s evolutionist philosophy of 
intellectual history. 

It is a commonplace that any two things will turn out to have some common 
property. But this may be a rather uninteresting or unimportant property, 
knowledge of which answers none of the questions that we want to ask about the 
things. The issue raised by Toulmin’s quasi-Darwinian account of scientific 
development is thus not whether there is anything at all in common between the 
Darwinian explanation of biological speciation and Toulmin’s explanation of 
change and continuity in rational disciplines. The issue is rather whether the 
features that are common to both are those that are integral in each case to the 
force of the explanation. 

Toulmin apparently claims that the two explanations share every element essen- 
tial to their structure and differ only in respect of inessential features. I shall 
argue, however, that there is one glaring difference of structure between the 
two explanations, which is wholly ignored by Toulmin, as well as another 
difference that he does notice but does not take seriously enough. And I shall suggest 
that Toulmin is in fact faced with an inescapable dilemma. If he adheres to 
his present framework of comparison between speciation and scientific develop- 
ment, then his Darwinian terminology is just a complex, and possibly confusing, 
metaphor. But, if he seeks to impose an ‘evolutionary’ analysis in a genuinely 
‘precise and strict sense of the term’ (p. 134), then his framework of comparison has 
to be substantially reorientated and thereby his explanation of scientific con- 
tinuity and development comes to lose all plausibility. 

Toulmin mentions just two ways in which he thinks his own use of an evolu- 
tionary framework differs from the Darwinian one. First, it has no ‘specifically 
biological details’—no ‘discussions about genetics or predators or water-supply’ 
(p. 139). Rather it is concerned, Toulmin says, with the general relationships to 
be found in certain historical processes. In Toulmin’s view it was a critical 
error of Mach’s, when he attempted to extend Darwinian categories into the history 
of thought, that he conceived the historical development of scientific knowledge as a 
biological process (p. 320). Secondly, Toulmin claims (pp. 337-9) that in an evolutionary 
process the factors responsible for the selective perpetuation of variants 
may or may not be connected with those responsible for the original generation 
of those same variants. Where they are so connected—which Toulmin calls 
‘coupled’ evolution—‘the rate of variation may be comparatively high, without 
the historical species thereby losing compactness, for the coupling itself will tend 
to offset the effects of the unusually rapid variation. In a “decoupled” evolution, 
on the other hand, the range of variation can be unlimited, since the historical 
species is preserved by a selection process which is severe in the treatment of all 
extreme novelties, except where these turn out to be “pre-adapted” to a new 
niche’. Toulmin has to admit that conceptual variation and intellectual selection 
are coupled. Conceptual variants are for the most part purposively thought up in 
order to solve the intellectual problems that beset a discipline. Toulmin also ad- 
mits that contemporary Darwinian biologists emphasise the lack of coupling 
between mutation and selection, in order to discourage providentialist or 
Lamarckian interpretations of organic evolution. The gamete has no clairvoyant 
capacity to mutate preferentially in directions pre-adapted to the novel ecological 
demands which the resulting adult organisms are going to encounter at some 
later time. But he rejects the view that this difference between coupled intellec- 
tual evolution and decoupled organic evolution is crucial. It does not, he thinks, 
debar us from speaking in a quite strict sense of the evolution of intellectual 
disciplines. 

Well, one has to allow the first dissimilarity between Toulminian and neo- 
Darwinian evolution. It would beg the question if one objected to Toulmin’s 
account just because of its subject-matter: one should not object on the ground 
that the history of rational disciplines is not a biologically characterisable process, 
or that unlike the normal evolution of organic species the history of rational 
disciplines may be strongly influenced by rational purposes. After all, Toulmin’s
declared objective is to treat ‘Darwin’s account of organic speciation simply as 
one special case of a more general pattern of historical change’ (p. 320). How- 
ever, the coupling-decoupling issue is rather more serious. That difference 
between Toulminian and neo-Darwinian evolution cannot be treated as being 
due just to the difference of subject-matter. Toulmin himself describes the coup- 
ling-decoupling distinction in quite general terms, without reference either to 
specifically intellectual or to specifically organic features. The distinction depends 
just on the existence or non-existence of connections between the factors respon- 
sible for selective perpetuation of variants and the factors responsible for the orig- 
inal generation of those variants. So if anything at all is to be conceived as being part 
of the pattern of evolutionary explanation rather than of its content, one would 
naturally suppose that decoupling should be so conceived. Toulmin’s failure so to 
conceive it is expressly acknowledged by him, but he offers no proper justifica- 
tion. He remarks that he is thus left free to employ evolutionary explanations in 
the discussion of human affairs without getting entangled ‘in the progressivism of 
Lamarck, Spencer and Engels’ (p. 340). But if no other rationale is offered for 
making this particular breach in the explanatory framework of contemporary
Darwinian theory Toulmin’s claim to be offering an ‘evolutionary’ analysis in the 
precise and strict neo-Darwinian sense of the term (p. 134) is seriously weakened. 
It seems rather as though he is prepared to alter the meaning of the term quite 
arbitrarily where the normal meaning does not fit his purposes. He speaks
scornfully enough of those who sought to defend doctrines of cultural relativism 
by invoking the authority of Einstein’s relativity theory (pp. 89-91). But he does
not seem to recognise the risk that he may himself be involved in a somewhat 
analogous solecism. 

Nor is the coupling-decoupling issue the only difference in pattern between
Toulminian and neo-Darwinian evolution. The central feature of Darwinian 
explanation is that the identity and evolution of species is explained by the oper- 
ation of the forces of natural selection on individuals. In other words, the chang- 
ing distribution of similarities and differences among the members of any popu- 
lation is controlled by the tendency of members with non-adaptive properties 
not to reproduce themselves. So if we want to make abstraction from the specific- 
ally biological subject-matter with which Darwin himself was concerned, and 
may ignore the coupling-decoupling issue for the moment, we can characterise 
the pattern, or structure, of Darwinian explanation somewhat as follows: within 
any population, or localised aggregate, of environmentally threatened individuals, 
the similarities that are selectively perpetuated are those that are favourable to the 
continued existence of such individuals. If there is any precise and strict sense of 
the term ‘evolution’ that is applicable to non-biological phenomena (irrespective 
of the coupling-decoupling issue), this seems to be it. But again this is not 
Toulmin’s sense of the term. Toulmin wants to be able to apply neo-Darwinian 
modes of analysis ‘not just to species, but to historical entities of all kinds’ 
(p. 356). In particular he wants to be able to apply them to rational disciplines.
But a rational discipline is not merely not a biological species: it is not a species 
at all—at any rate as Toulmin conceives it. It does not consist of a set of indiv- 
idual members that are similar to one another in all relevant respects and may be 
said to instantiate it. The individual animal at present on my hearth-rug
instantiates the species felis domestica, but the concept of mass cannot be said to
‘instantiate’ modern physics: it merely has an essential role or function therein. 
The relation between a concept and the rational discipline to which it belongs is, 
roughly, that of part to whole, not that of member to species or member to popu- 
lation. As Toulmin himself puts it (p. 166), ‘individual concepts or families of 
concepts bear the relation to complete disciplines that individual roles or institu- 
tions bear to complete societies’. 

Again, concepts of physics are not, as such, all closely similar to one another, 
like the members of a species or population. Rather, they are almost all impor- 
tantly different from one another, like the concept of an electron and the concept 
of a proton. Each such concept may conceivably have a few variant forms that 
are in competition with one another. But the great bulk of the ‘concepts, methods 
and aims’ within a rational discipline at any one time manage to be quite different 
from one another and yet not to be in competition, but in systematic association, 
with one another. When there are important differences between the members of 
an animal population, an opportunity arises for the intervention of natural 
selection, and the continued unity and identity of the species depend on the elim- 
ination of most of these differences. But unless there are important differences 
between quite a number of the concepts within a rational discipline the latter is 
too intellectually poverty-stricken to cope with the complexity of its problems. 
The integrity of a discipline depends on its possessing and preserving an
appropriately rich variety of survivally valuable concepts: the integrity of a biological
species depends on its possessing an appropriate sparseness of survivally valuable 
variety in the membership of its one or more populations. 

In face of these further discrepancies of pattern between Toulminian and neo- 
Darwinian evolution, Toulmin’s claim to be using the term ‘evolutionary’ in the 
precise and strict neo-Darwinian sense (pp. 134-5) seems hardly more accurate 
than the claim of some cultural relativists to be generalising from relativity 
physics. Perhaps someone will object that this does not matter. ‘If the develop- 
ment of a science is illuminated when we view it as a case of Toulminian evolu- 
tion’, the objector may say, ‘the illumination need not be lost when we recognise 
that Toulminian and neo-Darwinian evolution are different from one another’. 

But the trouble is that the illumination Toulmin hopes to provide here depends 
essentially on the existence of a close parallelism between the development of a 
scientific discipline and the evolution of a biological species. If the parallelism is 
not sufficiently close, Toulmin’s Darwinian analysis is nothing but metaphor and 
the light it sheds is rather a dim one. This is because Toulmin’s object is to solve 
the problem that he thinks Kuhn failed to solve—the problem of how to reconcile 
the Parmenidean element of truth in Frege’s position with the Heraclitan element 
of truth in Collingwood’s. He seeks to elucidate how a scientific discipline can
maintain a non-negligible unity and identity through periods of continuous 
change and development. And the elucidation he proposes is the thesis that the 
identity-through-change of a historical entity like a rational discipline is
essentially of the same, evolutionary kind as the identity-through-change of a biological
species (p. 356). But that thesis is not tenable. Darwin shed an immense flood 
of light on the identity-through-change of a biological species by pointing out 
how the forces of natural selection operated to preserve most similarities between 
the members of a population, while at the same time causing some new 
similarities to replace old ones. The identity or continuity of the species consisted
not in any identity or continuity of the individual members, whose life-spans might
well be quite short, but in the broad spectrum of mutual similarities that the
members of one generation shared with those of the immediately preceding 
generations. But in the identity-through-change of a scientific discipline there 
is nothing remotely like this. The unity of the discipline cannot consist in similar- 
ities between its constituent concepts, because these have to be different from one 
another in order to perform their proper jobs. A fortiori, its unity cannot consist
of similarities between its concepts that are preserved from one generation of 
concepts to another. 

It is worth while examining this problem of disciplinary unity a bit further. 
Let us grant Toulmin that a scientific discipline at any one time may be conceived 
as a certain kind of amalgam of ‘concepts, methods and aims’. It is pertinent then 
to ask: what kind of amalgam? If the relationship between the component elements
(the ‘concepts, methods, and aims’) is not that of similarity, as between the mem- 
bers of a species, what kind of relationship is it? One cannot elucidate the 
identity-through-change of an evolving biological species without invoking a 
framework of similarity-relations. What is the framework of relationships that 
must be invoked in order to elucidate the identity-through-change of a develop- 
ing scientific discipline? The discipline certainly cannot be described at any 
point of its history by just giving a list of unrelated elements. We need to be
told, for example, not just that mass, motion, etc. were the concepts of Newtonian 
physics, and that its methods and aims were such-and-such. We must be told 
also how those concepts were believed to satisfy those aims in conformity with 
those methods. That is, we must be told how those concepts, methods and aims 
fit together within some recognisably familiar pattern of relationships. One very 
common view has been that the core of that pattern is best described, or recon- 
structed, in terms of certain types of relationships between propositions—a 
hypothetico-deductive system, say, or a progressive problem-shift. This is the 
view that has been advocated in one form or another by Campbell, Nagel, Popper, 
Carnap, Hempel, Quine, Lakatos and many others. But Toulmin vigorously 
repudiates this view. According to him ‘No single ideal of “explanation”, or 
rational justification—such as Plato and Descartes found in formal geometry— 
is applicable universally in all sciences at all times’ (p. 156). So it is vital for 
Toulmin to find some other framework within which the elements constituting a 
scientific discipline can be so related to one another as to give it identity-through- 
change. The alleged analogy with biological evolution suggests that the problem 
may be solved by comparing the identity-through-change of a scientific discipline 
with the identity-through-change of an organic species. But the suggestion is
false because the framework of similarity-relations that are essential to the
identity-through-change of an organic species has no counterpart in the case of a
scientific discipline.

So, if no Darwinian elucidation of a scientific discipline’s identity-through-change
is possible, perhaps we should revert to seeking such an elucidation primarily
in terms of certain patterns of relationship between propositions? Admittedly
there may sometimes be several logically independent theories or conceptual 
systems that co-exist within a particular science. But this is not an objection, as 
Toulmin supposes (pp. 127-9), to the view that the primary framework of unity 
is a logical one. For either these sub-theories within a science are in competition
with one another or they are not. If they are in competition they must be recog- 
nisably concerned with the solution of the same problem, and if the problem is 
describable so too must be the incompatibilities between the proposed solutions.1 
But if the sub-theories are not in competition with one another then their logical 
conjunction may be viewed as composing the relevant system of propositions for 
that discipline at that time. 

There is also yet another reason for rejecting Toulmin’s would-be Darwinian 
account of scientific development. For him the parallel to be drawn is between a 
biological species or population and a scientific discipline, and between the organ- 
isms that compose a species or population and the concepts, methods and aims 
that compose a discipline. And it is evident, as already argued, that such a paral- 
lelism is too tenuous for Toulmin’s purpose. But there are two substantially 
more accurate ways of conceiving scientific development in Darwinian terms (at
least if we can disregard the coupling-decoupling issue). Curiously, Toulmin
does not mention either of these possible analyses, let alone give any reasons for 
rejecting them. 

First, any idea in the history of a science may be regarded as a certain species 
of idea in the minds of individual scientists. So the parallel may be drawn instead 
between a biological species or population, on the one side, and a concept, 
method or aim in the history of a science, on the other, and correspondingly
between individual organisms, on the one side, and—on the other side—thoughts,
of a certain type, in the minds of individual scientists. Indeed, this parallelism 
fits closer with the application of a Darwinian analysis to language-history. For 
the idiolects of individual native-speakers may be said to stand to their language 
as do the members of a biological population to their species,2 and the requirement
of mutual intelligibility between native-speakers of the same language corres- 
ponds then with the requirement of natural inter-fertility between members of 
the same species, as Toulmin himself declares (p. 342). 

But Toulmin does not himself treat the concepts of a discipline analogously to 
the way in which he treats the language of a speech-community. Since he is 
interested in the identity-through-change of disciplines, not of concepts, it is 
the former, not the latter, that are compared to biological species and to lang- 
uages. In Toulmin’s evolutionary analysis the concepts of a scientific discipline 
are ultimate elements, and their instantiations—the ideas in individual scientists’ 
minds, or idio-ideas, as we may conveniently call them here—have no theoretic- 
ally assigned function within his would-be Darwinian schema. Nor would it 
profit him to assimilate his account of concepts here to his account of languages. 
For if he did so assimilate them he would not come anywhere even within sight 
of elucidating the identity-through-change of scientific disciplines. If he treated 
the concepts of a discipline as being analogous to a set of co-existing biological

1 For an adequate critique of Feyerabend’s well-known claim that the languages of certain
theories are logically incommensurable with one another, a much longer discussion
of his views is needed than would be appropriate here. But the root of the matter is that
one cannot give a plausible account of what it means to say of two particular theories
both that they are, at least in part, about the same subject-matter and also that their
languages are mutually incommensurable.

2 Brosnahan [1960]. Jacques Monod, in his [1970], pp. 154-5, writes of the evolution of
ideas in this way.

species or populations, the unity thereby attributed to the discipline would at
best be analogous to that of a biotic community. In a single ecosystem the different
species of plants and animals that compose a biotic community are held together 
by relationships of benefit, tolerance or exploitation, within the constraints
imposed by their common physical environment.1 But it is a familiar fact of
experience that the equilibrium of a biotic community is easily upset, in all sorts of
discontinuous ways. Forests are cut down, marshes dry up, jungles blaze, soil 
erodes, populations explode, and so on. So such a community cannot plausibly 
be assumed to possess the identity-through-change that characterises a biological 
species. No doubt that is why there are no everyday non-technical terms desig- 
nating particular biotic communities or ecosystems. But we have very many 
names for botanical and zoological species, even in everyday non-technical 
discourse, and thereby we recognise the continuity of such species’ existence as 
being normal. So if we discussed Toulmin’s problem in terms of the ecosystem 
analogy we should not be able to parallel the unity and continuity of an
intellectual discipline as well as the species analogy allows. We should achieve a stricter
pattern of evolutionary description, but we should not be able to explain what we 
wanted to explain. 

A second non-Toulminian form of historical evolutionism might therefore 
seem preferable. Suppose we conceive an intellectual discipline, as a historical 
entity, to be composed of very many disciplines-in-the-minds-and-behaviour- 
of-individual-scientists. Then the latter ‘idio-practices’, as we might call them, 
would stand in much the same relation to the discipline they constitute as do 
idiolects to a language or dialect, or members to a species. The adoption or 
abandonment of an idio-practice would be analogous to the birth or death of an 
organism, and mutual criticism and comprehension among the adopters of idio- 
practices would be analogous to interfertility between the members of a biolog- 
ical species. If the analogy is drawn in this way, the ‘concepts, aims and methods’ 
about which Toulmin says so much would not be members of a Darwinian 
population. As classes of idio-ideas they would instead be classes of the properties, 
parts or characteristics of members of a Darwinian population, viz. of a popula- 
tion of idio-practices. So what would be analogous to the systematic unity of 
idio-ideas in an idio-practice would be the systematic unity of the parts and 
features of an organism in the living plant or animal. This would certainly be 
better for Toulmin’s purposes than the ecosystem analogy. But we should still 
have an analogy rather than an explanation. For while physiology might elucidate 
the working interdependence that unified the parts and features of an organism 
into a living creature we should still lack any correspondingly specific elucidation 
for the working interdependence of idio-ideas in the idio-practice. The Darwinian 
pattern again affords no substitute for the idea of a hypothetico-deductive system,
say, or a progressive problem-shift. 

In sum, there is no substance in Toulmin’s claim to have achieved an evolu- 
tionary analysis of scientific development that will allow us to dispense with any 
account of science in terms of relationships between propositions. For first, his 
analysis is far from being evolutionary in the precise and strict sense he claims for
it. And secondly, even if we choose a much stricter and more plausible applica- 
tion of the Darwinian terminology than Toulmin proposes, it is still inadequate 


1 Rose [1962], pp. 210 ff., and Clarke [1965], pp. 15 ff.
to the complexity of the phenomenon to be explained. For the self-identity of an
intellectual discipline, like that of any other historical entity, consists not only in 
its continuity over time but also in its unity and distinctness at any one time. If 
physics today is to be more or less the same intellectual discipline as it was 
yesterday, it must at least be an intellectual discipline in the first place, and that 
means that its various elements must cohere with one another in some fairly 
systematic pattern or patterns. No Darwinian explanation can elucidate this 
coherence, as we have seen. But that is scarcely surprising since it cannot eluci- 
date the corresponding coherence in a living organism either: that is a task for 
physiology. So the ultimate argument against Toulmin, even if he revises his 
evolutionary analysis as best he can, is that such an analysis will still leave a gap 
that needs to be filled by some other type of theory—which will do for an intel- 
lectual discipline or idio-practice what physiology does for an organism. Obviously 
such a theory needs to be capable of integration into a historical account that does
justice both to continuity and to change in the development of science. But even 
the most plausible evolutionary analysis is intrinsically incapable of supplanting 
such familiar paradigms here as the hypothetico-deductive system, the progressive 
problem-shift, etc. 


4 The Weakness of Toulmin’s Arguments against Logico-Analytical Philosophy of Science.
However, we have still to appraise the strength of Toulmin’s negative arguments 
against the more familiar type of view. According to this type of view the unity 
and rationality of a science is best understood in terms of some suitable logically 
sophisticated analyses or reconstructions of scientific theories and their criteria 
of evaluation. But there are many different ways in which these analyses or recon- 
structions may be attempted, and Toulmin’s criticisms do not apply to all of them. 

No doubt some adherents of such a view have tended to oversimplify their 
task by neglecting the pervasive facts of historical change and assuming that their 
typical problem-situation was concerned with the timeless appraisal of a single 
hypothesis or axiom-system against a given body of evidential propositions. 
But such an oversimplification is not an inevitable weakness in the logico- 
analytic approach. It is always important to distinguish the inherent potential of 
a philosophical position from the mistakes of particular philosophers who have 
adopted it; and sometimes one has to explore hitherto neglected aspects of that 
potential in order to be able to appraise the strength of the position properly. But 
Toulmin nowhere concerns himself with the question whether Carnap, Hempel, 
Popper and Lakatos have in fact exhausted the potential of the Fregean position. 
His criticisms of that position therefore have a certain polemical, ad hominem 
quality, which substantially weakens their cogency. 

For example, he claims (p. 170) that ‘in working discussions of scientific theory, 
scientists find little use for the logician’s distinction between “particular” and 
“universal” statements’. Instead, says Toulmin, they make ‘empirical
meta-statements’ about whether a particular theory is applicable universally or only in
a restricted class of situations. Now, even if what Toulmin says here were both 
true and important (which is open to question), it would still be possible to 
reformulate the inductive logician’s problem in those terms. For, as I have shown 
elsewhere,1 the grade of inductive support that given evidence affords a given
hypothesis may be conceived as varying inversely with the range of restrictions 
which that evidence enforces on the application of the hypothesis. No doubt that
is not how Carnap or Popper conceive inductive support. Nevertheless, it is a 
conception that can be shown to follow in a fairly direct line of development from 
Bacon’s tables of presence and absence and Mill’s methods of agreement and 
difference: indeed, it even embraces certain types of assessment of the extent and 
rationality of conceptual change—viz. where this change is tantamount to
narrowing or widening the meaning of a term.2

Moreover even if we were to adopt Toulmin’s evolutionary framework of analy- 
sis one of the most serious environmental challenges for any scientific discipline 
must surely be that posed for it by the relevant experimental data. The discipline 
can survive only in a form that is advantageously adapted to the description and
explanation of these data. But adaptation here is not a matter of all or nothing any
more than it is so in the case of a biological species. The British rat populations 
that have recently become adaptively immune to poisoning by warfarin are not 
yet immune to cyanide. Similarly the hypothesis that survives an initial set of 
tests may be undermined by a more drastic variation of experimental conditions. 
The need for inductive logic—the need for an assessment of the degree of support 
afforded a hypothesis by the experimental data—cannot possibly be eliminated 
by an evolutionary account like Toulmin’s. Inductive logic just reappears within 
the quasi-Darwinian framework as an indispensable branch of intellectual ecology.

No doubt, Carnap was wrong to set as his ideal a mode of measuring inductive 
support that would be valid a priori for all branches of scientific enquiry. No 
doubt, as Toulmin says, ‘each effective discipline has had specific goals and ideals, 
which have determined its specific methods and structures, and one central strand 
in its historical development has been the progressive refinement and clarific- 
ation of those goals and ideals’ (pp. 156-7). But the moral to be drawn from this 
is not that any theory of inductive logic is pointless. The moral is rather that if 
a theory of inductive logic is to have any point at all it must generate different 
patterns of assessment in different disciplines, in accordance with the various 
specific types of test, experimental controls, etc., that are severally appropriate. 
Also it must expose each of these patterns of assessment to continuous revision in 
the light of experience within the discipline to which it is appropriate. And again 
such features can be shown to belong necessarily to any theory of inductive logic 
that works out the implications of Bacon’s and Mill’s conception of induction by 
variation of circumstances.3 What variations of experimental circumstances we
take to be relevant to testing hypotheses about a particular subject-matter at a 
particular time are bound to depend both on the nature of the subject-matter and 
on previous experience within the same field. We now know that it is a mistake 
not to test drugs like thalidomide for the possibility of toxic effects in cases of 
pregnancy. 

There is yet another point at which Toulmin’s account of rationality raises 
familiar problems about inductive reasoning. He asserts (p. 503) that ‘the burden 


1 Cohen [1970], Section 16. 

2 Ibid., pp. 74, 125 ff. and 143 ff. For the inductive logic of choice between theories where 
more radical conceptual changes are at issue, cf. Cohen [1971].

3 Cohen [1970], Sections 5-9. 
of “rationality” ... consists in the fundamental obligation to continue
reappraising our strategies in the light of fresh experience’. Few would dispute this.
But the philosophical problem here is posed by the question why it is rational 
thus to keep building on our experience. If Toulmin’s proposed answer is that 
this affords our concepts an adaptive advantage in their competitive struggle for 
survival, then the answer is question-begging along lines that are scarcely unfam- 
iliar in relation to the problem of induction. For we can still ask why it is rational 
to assume that adaptation to previous experience will continue to be as advan- 
tageous as it has been. 

In sum, Toulmin holds (p. 479) that problems about inductive logic are not 
intrinsically bound up with problems about scientific rationality. But his view is 
plausible only at a very superficial and general level of statement, where the 
current state of Carnapian confirmation-theory is mistakenly taken to represent 
the total possible contribution of inductive logic and the evolutionary account of 
scientific development is not pressed out into its detailed implications. 

Nor need one suppose, in order to acknowledge the fundamental importance 
of inductive logic, that such a logic is the sole and ultimate arbiter of rationality in 
scientific development. It is rarely sufficient, though it is always necessary, to 
take the relevance and the extent of evidential support into account in assessing
the acceptability of a proposed change or novelty in the content of a scientific
discipline. To put the point in the metaphor of Toulmin’s quasi-Darwinian
framework, though inductive logic is an indispensable branch of intellectual 
ecology, it is not the only branch. Acceptability—or, in Toulmin’s terms, 
survival-value—is bound to depend also on the consideration of such utilities as 
computational facility, explanatory depth, technological applicability, intellec- 
tual fruitfulness and so on. But even Carnap had for many years seen his induc- 
tive logic as determining just one element within the larger whole of normative 
decision theory.1 No doubt the relevant utilities may vary widely from discipline
to discipline. No doubt they as yet seem very hard to measure or compare with 
any degree of accuracy. But Toulmin says nothing to justify rejecting the view 
that in principle all such considerations admit of treatment within a systematic 
theory of acceptance or rational decision-making. 

If natural science itself progresses by the discovery of underlying uniformities 
beneath the surface heterogeneities reported by natural history, a philosopher of 
science can hardly object to the attempts of inductive logicians and decision
theorists to discover underlying uniformities of structure within the heterogen- 
eous modes of rational assessment that they find reported in the history of science. 
If a precedent is sometimes to be drawn from one discipline for the benefit of 
another, as Toulmin himself declares (pp. 86-9, 491-3), there must be an under- 
lying general principle it embodies—analogous to what lawyers call a ratio 
decidendi. Toulmin apparently wants philosophy of science to turn away here 
from the search for underlying uniformities and content itself with the piecemeal 
description and comparison of assessments in particular disciplines at particular
times. But that is not the true path of a rational discipline, and philosophy of
science sets out to be a rational discipline. Certainly physics and chemistry would
not have made much progress if they had contented themselves with the super- 
ficial Aristotelianism that treated ice, water and steam as different kinds of 


1 Cf. Carnap [1971], which expands his [1960]. 
substance. The hunt for a comprehensive theory is often very lengthy, and may be
led astray by many false scents. But that is no reason for giving it up altogether. 


5 Some Factual Errors in the Text. 


Toulmin regards the problem of human understanding as being necessarily a 
matter for interdisciplinary enquiry. He mentions specifically the physiology of 
perception, the sociology of knowledge and the psychology of concept formation, 
as being among the disciplines involved (p. 7). It is obvious too, from his frequent 
use of historical material, that he also regards the history of science as having 
an essential role in the enquiry. And all this is in keeping with his contention that 
a merely logical analysis cannot grasp the true nature of human rationality. It is
therefore particularly unfortunate that his text seems to contain many factual 
errors. In the kind of philosophical writing in which occasional facts are cited in 
order to illustrate some pattern of logical analysis the reader’s confidence in the 
foundations of the argument is not necessarily shaken if a substantial proportion 
of the alleged facts turn out to embody inaccuracies or misrepresentations. If 
other illustrations can be found, the force of the argument may remain unaffected. 
But a philosopher who makes a great virtue of his dependence on the concrete 
richness of his interdisciplinary knowledge needs to be rather more careful about 
his facts than Toulmin has in fact been. I append a list of the errors I have noted in 
passing. These errors vary in their degree of centrality or importance for Toulmin’s
argument. But it is difficult not to suspect that a better informed reviewer, or
a more persistent investigator, might discover many more such, since the errors I
do list, while sometimes unimportant in themselves, suggest fairly clearly that 
Toulmin is often content with a rather cavalier and impressionistic treatment of 
historical fact. 

These errors may be divided into two groups. The first six are inaccuracies 
or misrepresentations in relation to the views of other workers in the same field. 
The other five are mistakes in relation to the history of science or culture. 


1. On pages 62-3 Toulmin attributes to Hempel the thesis that his (Hempel’s)
analysis of confirmation is expressed in ‘the language of science’ (my italics) and
that what he (Hempel) means by ‘the language of science’ is ‘the lower functional
calculus with individual constants . . . universal quantifiers for individual
variables, and the connective symbols of denial, conjunction, alternation and
implication’. But what Hempel actually said, in the passage cited by Toulmin,¹
was that the logical structure of the languages to which his definition
of confirmation is applicable is that of the lower functional calculus,
etc., while he expressly recognised the difficulty of defining confirmation
for any scientific language that has a more complex structure and richer means
of logical expression. The phrase with the definite article—‘the language of
science’—, which Toulmin unambiguously attributes to Hempel, does not occur
at all in this passage of Hempel’s text.

2. Lakatos fares no better than Hempel. In discussing Lakatos’s ‘History of
Science and its Rational Reconstructions’,² Toulmin says (p. 482) the following:


1 Hempel [1945], p. 108, footnote 1. Cf. Hempel [1965]. 
2 Lakatos [1971].
‘Lakatos’s new account certainly promises to be an advance on Popper’s
position ... Thus in direct opposition to Popper he says “If a demarcation
criterion between what is, and what is not, scientifically ‘rational’, is inconsistent
with the basic appraisals of the scientific élite, it should be rejected”.’


But not only is this an inaccurate quotation from Lakatos’s paper (op. cit.
p. 111); it also makes the mistake of attributing to Lakatos a view which
Lakatos himself proposes only tentatively (op. cit. p. 111), stigmatises as naive 
(op. cit. p. 110), considers himself as replacing by a better view (op. cit. p. 116), 
and actually does replace by applying his theory of degenerating and progressive 
problem-shifts to the assessment of research programmes in the problem of 
scientific rationality (op. cit. pp. 118-20). 

3. On pages 24-25 Toulmin describes himself as rejecting the twentieth- 
century tradition of epistemological discussion that began with Moore and 
Russell, in order to ‘start again from scratch... which will mean abandoning the 
philosophical autonomy of epistemology, and re-opening closed doors between 
formal philosophy and more substantive disciplines’. But did not Russell him- 
self often open those doors—for example, in his Human Knowledge, Its Scope and 
Limits (1948)? The whole of Part I of that book is devoted, Russell claims (p. 10), 
to an up-to-date summary of the scientific knowledge that philosophers of science 
must take as their datum. 
4. On page 27 Toulmin writes: ‘We shall deliberately begin by setting aside
the older question, “How does the perceiving Mind, from its standpoint deep 
within the Brain, acquire information about the External World?” so re-opening 
forgotten options’. But those who have forgotten these options must have very 
short memories or be very ill-read. After Ryle’s Concept of Mind (1949), with its 
exorcism of ‘the ghost in the machine’, very few philosophers have been content 
to pose the problem in the form that Toulmin claims to be setting aside. 
5. On pages 448-9 Toulmin states that ‘Chomsky himself has several times 
argued, with some vehemence, that the existence of the same grammatical patterns 
in all human language is sufficient evidence that the infants born into all language- 
using populations possess identical innate endowments’. But this is a travesty of 
Chomsky’s argument for innatism. For first, Chomsky has never claimed to 
possess evidential data about all human languages; and secondly, his arguments 
for innatism have always depended quite substantially on facts about the short- 
ness of time available for childhood language-learning, the degeneracy of the 
child’s auditory input, the sureness with which complex underlying structures 
are grasped, and so on.
6. Where Toulmin misrepresents the views of other thinkers, the effect is often 
(as in the above cases) that his own position appears more sophisticated than theirs.
But this is not always so. Occasionally a misrepresentation suggests, wrongly, that 
the researches of others support his views. For example on page 336 Toulmin 
states that ‘the Darwinian pattern of explanation has been used in philosophy of 
science, with great sensitivity in G. Holton, “Scientific Research and Scholarship”,
Daedalus, 92 (Spring 1962), pp. 362-99’. But what Holton tried to do in this 
article (which in fact appeared in volume 91 of Daedalus) was to give a detailed 
analysis of the growth of academic science (and specifically of U.S.A. physics) as 
a profession. He never mentioned Darwin, evolution, populational explanation
or anything of the kind. Instead he offered quite a different model, based on the 
analogy of a voyage of exploration. 

7. I now pass to some errors of intellectual history. On page 47 Toulmin states 
that ‘Montesquieu and Voltaire . . . knew little about culture and societies out- 
side the Middle-Eastern and European traditions. China and India had barely 
emerged from the realms of the fabulous’. But it really will not do to describe the 
long and scholarly labours of the Jesuits in these terms, and to speak as if 
eighteenth-century sinology was still operating at the level of Marco Polo. 
Quite apart from such earlier figures as Gabriel de Magaillans (Nouvelle Relation
de la Chine, Paris, 1689) and Louis Daniel le Comte (Nouveaux mémoires sur l’état
présent de la Chine, Paris, 1696), the encyclopaedic achievements of Jean Baptiste
du Halde can hardly be ignored altogether (Description géographique, historique,
chronologique, politique et physique de l’empire de la Chine et de la Tartarie Chinoise,
Paris, 1735). To produce the 38 maps in the latter volume the Emperor of China
paid eight Jesuits to travel over east and central Asia for nine years, equipped with
as good surveying instruments as were ever used on a European map of the same
date. What is fabulous about this? Moreover Montesquieu had obviously read
du Halde and often quotes from him (e.g. in Esprit des Lois, book XIX, chapters x,
xiii, xvi and xvii, Geneva, 1748), referring to Chinese customs and ideas
alongside those of Persia, Egypt, Greece, Rome, etc.

8. On pages 247-8 Toulmin accepts without question Duhem’s distinction
between ‘the French esprit géométrique’ and the ‘British esprit d’ampleur’. On the
one side there are the personified virtues of Racine (thus Toulmin: Duhem
actually mentioned Corneille here, not Racine), the abstract generality of the
Code Napoléon, Descartes’ mathematical rationalism, and, Toulmin adds, the
geometrical precision of a French parterre. On the other side there are the individual
characterisation of Shakespeare’s heroes and heroines, the concrete particularity
of common-law precedents, Bacon’s method of empirical generalisation
and ‘the deceptively natural sweep of the traditional English garden’. But
Toulmin is quite wrong about gardens. In the days of Bacon and Shakespeare
English gardens had just as much geometrical precision as French ones, and after
William Kent had achieved his revolution in landscaping the new vogue of the
Romantic garden spread across Europe. Moreover Charles Dickens’s very
English novels are full of personified virtues, just as Proust’s very French
sensitivity issues in highly individual characterisations. The English jurist
Bentham deduces laws of universal applicability from philosophical principles of
great generality: the French philosopher Montesquieu was much more tolerant
and empirical in his approach to the problems of legislation. Toulmin is altogether
too ready to accept Duhem’s chauvinistic crudities.

9. Duhem is grossly unfair to Oliver Lodge, and Toulmin relies on Duhem’s
account of Lodge’s views without querying it. Duhem attributed to Lodge
a mechanical model for electrical theory, and criticised him with the famous
words: ‘We thought we were entering the tranquil abode of reason but we find
ourselves in a factory.’ However, Lodge was perfectly clear that his mechanical
analogy was solely a didactic device for a non-mathematical audience (cf. the
book’s Advertisement) and he never claimed that it represented the true nature of
electricity. On page 186 of his book he called it ‘a provisional representation’,
and remarked that ‘without attempting a complete and satisfactory representation
of what is going on, we can think of some mechanical arrangements which
have some close analogy with electrical processes’. Similarly on page 269 Lodge 
described his diagram as ‘a very crude suggestion of a mechanical analogy to 
what may be taking place’. So it is just an ignoratio elenchi to criticise Lodge for 
not giving a mathematical account of electrical phenomena, when he was expli- 
citly striving to convey something—however inadequately—about the nature of 
electricity to readers who could not appreciate any mathematical treatment of the 
subject. To regard Lodge’s efforts at popularisation in this book as an example of 
differences in national style between British physics and French physics is a mis- 
representation of the facts which Toulmin has copied from Duhem. There may 
be examples of such differences in national style of physical theory. Specifically, 
there may even have been ‘rival French and British strategies in electrical theory’, 
as Toulmin claims (p. 250). But if there is any difference at all that Lodge’s book 
can serve to illustrate it would have to be a difference in styles of popularisation; 
and since neither Duhem nor Toulmin cites any French examples of this genre 
their argument remains quite incoherent. (Toulmin makes a similar mistake 
about Maxwell and Kelvin on p. 249, where he says ‘The mechanical models
used by Maxwell, Kelvin and Lodge to explain electrical phenomena simply
served as alternative representations or projections, of those same intellectual
relations that figured as formal entailments in French treatises of mathematical
physics.’ That Maxwell came to reject the use of mechanical models as explanations
of electrical phenomena is clear from the passages quoted by M. Hesse.¹
That Kelvin too treated his mechanical models as not being true representations
of nature is clear even from Duhem’s citations in The Aim and Structure of Physical
Theory.²)
10. On page 284 Toulmin states, in a footnote: ‘It has been remarked, with some 
justice, that the political slogans of the “England-returned” men, who took 
over power in the Anglophone countries of Africa and Asia during the 1950s and
1960s, bore less relation to their actual policies—which were decided very largely 
on pragmatic grounds—than they did to the views circulating among Harold 
Laski’s research students during the particular years when they were themselves 
studying at the London School of Economics.’ But while such a remark may be 
perfectly in place across the dinner-table, or in the gossip column of a newspaper, 
it hardly reaches the level of accuracy suitable to a three-volume treatise on the 
philosophy of science. The remark implies that all, or nearly all, these politicians 
studied at the London School of Economics. But Nehru and Senanayake were 
at Cambridge, Banda and Nyerere at Edinburgh, Khama at Oxford. No doubt 
there were also a few politicians in this category who were at the London 
School of Economics, e.g. Kenyatta, but hardly as many as elsewhere. Toulmin’s 
acceptance of the remark quoted in his footnote is symptomatic of a cavalier 
attitude to the facts which is inconsistent with his professed respect for historical 
evidence. 
11. On pages 380-95 Toulmin discusses the characteristics of what he calls
‘would-be disciplines’, as distinct from ‘compact’ ones. Physics, of course, is his
paradigm of compactness, while psychology, sociology and anthropology are 
cited as examples of would-be disciplines. But, whatever may be true of sociology 


1 Cf. Hesse [1961], pp. 208-9. 2 Duhem [1906], p. 84. 
and anthropology, much of what Toulmin says or implies about contemporary
psychology is wide of the mark. One of the characteristics of a would-be discipline 
according to Toulmin is that theoretical debate in the field concerned ‘becomes 
largely—and unintentionally—methodological or philosophical; inevitably, it is 
directed less at interpreting particular empirical findings than at debating the 
general acceptability (or unacceptability) of rival approaches, patterns of explan- 
ation, and standards of judgement’. But anyone who reads through the current 
issues of a dozen or so leading psychological journals will find that this is just not 
so in contemporary psychology. Again, according to Toulmin there is no con- 
sensus among psychologists about what kinds of thing are puzzling about human 
behaviour. But the problems studied in the contemporary literature are much the 
same as they have always been: for example, the mechanisms of perception, 
memory, learning, communication, and motivation. Again, according to Toulmin 
‘few authoritative reference-groups have developed in theoretical psychology 
of the sort that exist in the physical sciences. Rival factions are separated less 
by a methodical and agreed sub-division of the outstanding problems than by 
radical differences of theoretical approach, and each group tends to operate 
under the magisterial dominance of its own “chief” or “high priest”. . . . Rival
factions work with different conceptions of what a “behavioural science” should 
be, and organise their literature less on an agreed specialisation of functions than 
on conflicting claims to sovereign independence’. But whereas some of this may 
have been true in some countries ten, twenty or forty years ago it does not 
describe the overall contemporary situation correctly. There are authoritative 
reference-groups in academic psychology, at least in the U.K. and U.S.A., as 
anyone engaged in obtaining public money for psychological research is aware. 
There may still be deep divisions within clinical psychology. But in academic 
psychology the situation is rather different now. Standard differences between 
ethological, experimental and neurological methods are normally construed by 
the present generation of those who practise them as being differences between 
mutually complementary approaches to a set of common problems. Even psycho- 
linguistics is settling down, after a stormy decade, to a fairly tolerant and open- 
minded outlook. Nor are such bodies as the Experimental Psychology Society or 
the British Psychological Society dominated by ‘chiefs’ or ‘high priests’, whether 
native or foreign. Again, according to Toulmin, ‘suspicions of methodological 
heresy prevent fruitful intellectual debate between members of different factions’. 
But, as an example of such debate, what about Roger Brown’s work¹ on the
Skinnerian theory of syntax-learning, or the controversy between Braine, Bever,
Fodor and Weksel about the learning of word-order?²

No doubt there are many important differences between physics and psychology 
other than differences of subject-matter. Perhaps too if the psychological sciences 
ever attain a more advanced form we shall then be able to recognise more easily, 
and characterise more exactly, those features of immaturity which they ex- 
hibited in 1972. But Toulmin has certainly got some of those features wrong. 


L. JONATHAN COHEN
The Queen’s College, Oxford


1 Cf. Brown [1970], pp. 198-202. 
2 Cf. Braine [1963], Bever, Fodor and Weksel [1965] and Braine [1965].
HOW ARTIFICIAL IS ARTIFICIAL INTELLIGENCE?*


artificial contrived (opp. to spontaneous): made by man: synthetic (opp. to natural): 
fictitious, made in imitation (opp. to real): (obs.) ingenious: (Shak.) perh. creative, 
playing the artificer, or perh. merely skilful: (obs.) technical. 
Chambers’s Twentieth Century Dictionary 


The artificiality of artificial intelligence is of controversial logical status. Some 
claim that the contrived and unrealistic character of current machine intelli- 
gence is contingent rather than essential, that despite the formidable problems 
involved there is no reason in principle why a manmade system (of inorganic, and 


* Review of F. J. Crosson (ed.) [1970]: Human and Artificial Intelligence, New York 1970, 
Appleton-Century-Crofts, $4.40, pp. vi+267; B. Meltzer and D. Michie (eds.) [1969]: 
Machine Intelligence 5, Edinburgh 1969, Edinburgh University Press, £7.00, pp. viii+588;
and B. Meltzer and D. Michie (eds.) [1971]: Machine Intelligence 6, Edinburgh 1971, 
Edinburgh University Press, £10.00, pp. 525. 
perhaps even digital, construction) should not match all important aspects of
spontaneous human thought. Others argue that there are unavoidable and 
significant limitations on the extent to which intelligence of the synthetic variety 
could simulate its natural precursor, and that achieved ‘successes’ are fictitious 
rather than real since they rely directly on the ingenuity and skill of the 
human artificer and not on any creative potential attributable to the artifice. 
Claim and abstract counterclaim are scattered throughout the cybernetic, 
psychological, and philosophical sources. Meanwhile, continuing research in 
machine intelligence attempts to provide programs—or, sometimes, useful
prolegomena to programs—directed to problems of a type commonly faced by 
man’s intelligence, as well as dealing with problems of a more specialised and 
technical nature.¹ In this article I shall review some specific instances of current
research in the light of general points arising in the background literature. 

In 1947 Turing wrote a paper on intelligent machinery for presentation to the
National Physical Laboratory. In this (previously unpublished) paper he analysed
and forecast some of the possibilities inherent in the stored-program
digital machines which were then first becoming operational. As one might ex- 
pect from familiarity with his published writings, Turing gave short shrift to 
those who assumed that machines could not possibly show intelligent behaviour. 
Whether in spite of or because of his often rhetorical responses to this opposi- 
tion, an eye-witness records that he threw the National Physical Laboratory into 
a furore, some of the audience experiencing the nonbeatific vision of Turing’s
‘infesting the countryside with a robot which will live on twigs and scrap iron!’²
Less apocalyptic anticipations on this occasion were Turing’s prophetic outlines
of the notion of subroutines and of the type of mechanised problem-solving that 
depends on theorem-proving algorithms. Turing’s definition of intelligence 
stressed both ‘discipline’ and ‘initiative’, and as a typical example of a task 
requiring initiative he suggested problems of the form ‘Find a number n such 
that ...’, confidently claiming that ‘We should not go far wrong for the time 
being if we assumed that all problems were reducible to this form. It will be time 
to think again when something turns up which is obviously not of this form.’ 
Much effort in machine intelligence has indeed been directed to questions of this 
general form, and many second-order discussions of theorem-proving appear in 
the technical literature.³ But we shall see later that it is doubtful whether
Turing’s confidence in the universality of such questions (and so of theorem- 
proving methods in general) was well grounded. Turing denied that there could 
be any purely objective definition of the notion of intelligence as commonly 
understood, since he believed that the extent to which we regard something as 
intelligent is determined as much by our own state of mind and training as by the 
properties of the object under consideration. In illustration of this claim he 
wrote: ‘If we are able to explain and predict its behaviour or if there seems to be 
little underlying plan, we have little temptation to imagine intelligence. With the 
same object therefore it is possible that one man would consider it as intelligent 
and another would not; the second man would have found out the rules of its 


1 The Proceedings of recent meetings of the Annual Machine Intelligence Workshop,
edited by B. Meltzer and D. Michie (op. cit., above, footnote *).
2 Meltzer and Michie [1969], Prologue.
3 Meltzer and Michie [1969], chapters 6-11 and Meltzer and Michie [1971], chapters 4-8.
behaviour.’ Consideration of subsequent work in artificial intelligence shows
that the implied contrast between processes with some underlying plan and pro- 
cesses following rules is less sharp than Turing suggested in this early paper. 

Only ten years after Turing’s address to the NPL, his vision of mechanised 
intelligence was achieving initial realisation in the form of programs such as
Samuel’s checker-playing program, various chess programs, and the General
Problem Solver (GPS) of Newell, Simon, and Shaw. Samuel’s checker-playing
program¹ is commonly regarded as a landmark in artificial intelligence work. It
incorporated features such as heuristic search, alpha-beta pruning, and machine
‘learning’ by way of automatic improvement of evaluation procedures; a later
version of the program stressed the importance of stable strategies and of interactions
between parameters that had previously been treated independently.²
In a review of the current state of the art of chess programming, Levy³ expresses
surprise that Samuel’s method for allowing the program to improve its own
coefficients has not been more widely followed. Levy also points out that rather 
little progress has been made in chess programming since the early days (although 
a few excellent general discussions of the area have been provided), since the 
favoured strategy for writing programs appears to omit the initial steps of reading 
the literature in order to profit from the mistakes and insights of others, and 
thinking about what chess playing (and so chess programming) is all about. The 
weakness and artificiality of current chess programs, he claims, is largely due to 
the programmers’ lack of attention to the natural intelligence of strong chess 
players. 

GPS was particularly influential, both in the professional development of the 
information processing approach to human and artificial intelligence and in its 
subsequent popularisation. Some of the basic assumptions of this approach were 
clearly stated by Simon and Newell in the Scientific American.⁴ Their central
claim was that a small set of elements, similar to those postulated in information 
processing languages, is sufficient for the construction of a theory of human 
thinking—in other words, there is no essential difference between artifice and 
artificer. They relied extensively on their experience with GPS, a program incor- 
porating a number of general-purpose information processing techniques that 
the programmers believed to be at the heart of problem solving in general. 
Indeed, Newell and his colleagues explicitly drew on and imitated human example 
when designing GPS. For instance, one of the most important features of their 
program was suggested by the fact that—as Turing had implied—the human 
problem solver typically plans his thinking when faced with problems of any 
difficulty. References to their early recognition of the importance of planning 
are scattered throughout the current cybernetic literature, for the concept is 
acknowledged to be crucial to creative intelligence. 

In everyday discourse the term ‘planning’ is often used to denote any thought 
process where the agent ‘thinks ahead’, making decisions in the light of his pre- 
dictions as to the probable outcome of the various possible choices. Workers in 
artificial intelligence sometimes use the term in this purely anticipatory way: 
for instance, Doran’s discussion of ‘Planning and Robots’⁵ concentrates on the

1 Crosson [1970], chapter 4. 2 Crosson [1970], chapter 4.
3 Meltzer and Michie [1971], chapter 12. 4 Crosson [1970], chapter 2.
5 Meltzer and Michie [1969], chapter 27.

robot’s need to consider possible courses of action in the light of some desired
goal, and to select and carry out the most promising. An example of a problem 
solving program which employs ‘planning’ in this sense is the Graph Traverser 
described by Michie and Ross.¹ Problems that can be solved by this program
include sliding-block puzzles and ‘commercial traveller’ route-finding problems. 
The Graph Traverser interprets all problems either as the task of finding a path 
across a graph from one particular node to another, or as the task of finding a 
node satisfying certain constraints. In other words, for a problem to be accepted 
by this program, it has to be represented as a mathematical graph—the nodes of 
the graph correspond to possible states of the problem (starting state, inter- 
mediate states, goal state) and the arcs correspond to possible operations that 
transform one state into another. The operators have no unknown side-effects: 
indeed, they have no side-effects at all, their sole result being the transformation 
of a single state into a different one. The essential ‘thought process’ or solution
procedure of the Graph Traverser is to grow a graph from the starting node given 
by the conditions stated initially in the problem, and to search this tree structure 
until the desired node is found. Various heuristics allow for step-by-step evalua- 
tion of each state with respect to its distance from the desired state, and for con- 
sequent pruning of the potential search tree so that the tree actually generated is 
the one most likely to sprout the goal-node. In effect, the program deliberates on 
alternative possible sequences of operations and selects the most efficient, thus 
satisfying the anticipatory sense of ‘planning’. 
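
The solution procedure just described can be illustrated by a minimal sketch in Python. It is not Michie and Ross's program. The 3 × 3 sliding-block encoding, the tile-displacement estimate, and the names `successors`, `distance_estimate` and `traverse` are all assumptions introduced here for illustration, and the 'pruning' is reduced to the simplest possible rule: always develop the node that currently looks nearest to the goal.

```python
import heapq

GOAL = (1, 2, 3, 4, 5, 6, 7, 8, 0)  # 0 marks the empty square (assumed encoding)


def successors(state):
    """Yield every state reachable by sliding one tile into the gap."""
    gap = state.index(0)
    row, col = divmod(gap, 3)
    for d_row, d_col in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        r, c = row + d_row, col + d_col
        if 0 <= r < 3 and 0 <= c < 3:
            tiles = list(state)
            other = r * 3 + c
            tiles[gap], tiles[other] = tiles[other], tiles[gap]
            yield tuple(tiles)


def distance_estimate(state):
    """Heuristic distance from GOAL: summed row and column displacement of each tile."""
    total = 0
    for position, tile in enumerate(state):
        if tile:
            goal_position = GOAL.index(tile)
            total += abs(position // 3 - goal_position // 3)
            total += abs(position % 3 - goal_position % 3)
    return total


def traverse(start, max_nodes=100_000):
    """Grow a search tree from start, always developing the most promising node."""
    parent = {start: None}                      # the tree grown so far
    frontier = [(distance_estimate(start), start)]
    while frontier and len(parent) <= max_nodes:
        _, state = heapq.heappop(frontier)
        if state == GOAL:
            path = []
            while state is not None:            # walk back up the tree
                path.append(state)
                state = parent[state]
            return path[::-1]
        for nxt in successors(state):
            if nxt not in parent:               # operators have no side-effects,
                parent[nxt] = state             # so a state is fully described
                heapq.heappush(frontier, (distance_estimate(nxt), nxt))
    return None


path = traverse((1, 2, 3, 4, 5, 6, 0, 7, 8))
```

Each state is a node and each legal slide an arc, as in the text. The estimate never inspects the structure of the problem as a whole, which is precisely the limitation of step-by-step heuristics that the stronger sense of planning is meant to overcome.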

However, Simon and Newell’s sense of planning is stronger than this, for it 
specifies an anticipatory strategy which reduces the overall complexity of the
original problem in a particular way. In their words, ‘the essential idea in planning
is that the representation of the problem situation is simplified by deleting
some of the detail. A solution is now sought for the new, simplified, problem, and 
if one is found, it is used as a plan to guide the solution of the original problem, 
with the detail reinserted’. This strategy is important because step-by-step 
heuristics, such as those incorporated into the Graph Traverser, are inadequate 
for really difficult problems, where the machine must be able to analyse the prob- 
lem structure as a whole. But if the machine has a facility for planning, in Simon 
and Newell’s sense, it can use a simplified version of the problem as a whole (not 
to be confused with a part of the whole problem) as a model for solving the 
complete problem. ‘Planning’, so defined, is a truly general problem-solving 
strategy, though specific instances of planning may differ widely from case to 
case in both natural and artificial systems. 
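
A small sketch may make the contrast with step-by-step search concrete. It assumes a toy route-finding problem on a grid of open (`.`) and blocked (`#`) squares. The grid encoding, the block size, and the names `breadth_first`, `adjacent` and `plan_then_solve` are hypothetical and are not drawn from Simon and Newell. Detail is deleted by collapsing each 2 × 2 block of squares into a single coarse square; the coarse problem is solved first, and its solution is then used as a plan that confines the detailed search to a corridor.

```python
from collections import deque


def breadth_first(start, goal, neighbours):
    """Plain breadth-first search; returns the list of states visited, or None."""
    parent = {start: None}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        if state == goal:
            path = []
            while state is not None:
                path.append(state)
                state = parent[state]
            return path[::-1]
        for nxt in neighbours(state):
            if nxt not in parent:
                parent[nxt] = state
                queue.append(nxt)
    return None


def adjacent(cell):
    r, c = cell
    return [(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]


def plan_then_solve(grid, start, goal, block=2):
    open_cells = {(r, c) for r, row in enumerate(grid)
                  for c, mark in enumerate(row) if mark == "."}

    def coarse(cell):
        # Delete detail: keep only which block a square belongs to.
        return (cell[0] // block, cell[1] // block)

    # Simplified problem: a block counts as passable if any square in it is open.
    coarse_cells = {coarse(cell) for cell in open_cells}
    plan = breadth_first(coarse(start), coarse(goal),
                         lambda b: [n for n in adjacent(b) if n in coarse_cells])
    if plan is not None:
        # Reinsert the detail, but search only inside the blocks named by the plan.
        on_plan = set(plan)
        corridor = {cell for cell in open_cells if coarse(cell) in on_plan}
        path = breadth_first(start, goal,
                             lambda s: [n for n in adjacent(s) if n in corridor])
        if path is not None:
            return plan, path
    # The plan is only a guide: if it fails, fall back on unguided search.
    return plan, breadth_first(start, goal,
                               lambda s: [n for n in adjacent(s) if n in open_cells])


GRID = ["....",
        ".##.",
        ".##.",
        "...."]
plan, path = plan_then_solve(GRID, (0, 0), (3, 3))
```

The coarse grid is a simplified version of the problem as a whole, not a part of it, which is the distinction the text insists on. The fallback branch reflects the fact that a solution of the simplified problem guides, but does not guarantee, a solution of the original.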

The simplest case of planning would be one in which the solution of the model 
problem in and of itself generated a solution of the more detailed problem. An 
example is available from Simon and Newell’s own work with GPS.² Given the
logical problem of proving ‘C’ from the three premises ‘A’, ‘not A or B’, and ‘If
not C then not B’, the program abstracts from both states and operators to note
that the conclusion contains C alone, while the premises respectively contain A
alone, A with B, and B with C. GPS then generates the plan of obtaining B by
somehow combining A with AB, then of obtaining C by combining B with BC.
The solution of this problem requires ‘not A or B’ to be transformed into ‘A
implies B’, and ‘if not C then not B’ to be transformed into ‘B implies C’; these 

1 Ibid., chapter 16. 2 Crosson [1970], chapter 2.

transformations follow from the definitions of the operators ‘or’ and ‘if...then’,
which are available to the program. Since these operators, connecting the relevant
arguments, are included within the original formulation of the problem, to solve
the simplified problem therefore is to solve the more detailed one. 
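
The two levels of this example can be sketched as follows. The sketch is not GPS, and nothing in it is taken from Newell and Simon's code. The rule used by the hypothetical `abstract_plan` (a premise can be used once all but one of its letters have been obtained) is an assumption made for illustration, and a truth-table check in `entails` stands in for the detailed transformations of the connectives.

```python
from itertools import product

# Each premise is paired with the set of letters it mentions (the abstraction)
# and with its full truth function (the detail that the abstraction deletes).
PREMISES = [
    ({"A"}, lambda v: v["A"]),                        # A
    ({"A", "B"}, lambda v: (not v["A"]) or v["B"]),   # not A or B
    ({"B", "C"}, lambda v: v["C"] or not v["B"]),     # if not C then not B
]
CONCLUSION = ({"C"}, lambda v: v["C"])


def abstract_plan(premises, goal_letters):
    """Solve the simplified problem: ignore the connectives, keep only the letters."""
    known, plan = set(), []
    remaining = [letters for letters, _ in premises]
    while not goal_letters <= known:
        usable = [letters for letters in remaining if len(letters - known) == 1]
        if not usable:
            return None                     # no plan at this level of abstraction
        letters = usable[0]
        (obtained,) = letters - known
        plan.append((sorted(letters), obtained))
        known.add(obtained)
        remaining.remove(letters)
    return plan


def entails(premises, conclusion, letters="ABC"):
    """Detailed check: no assignment makes every premise true and the conclusion false."""
    for values in product([True, False], repeat=len(letters)):
        v = dict(zip(letters, values))
        if all(truth(v) for _, truth in premises) and not conclusion[1](v):
            return False
    return True


plan = abstract_plan(PREMISES, CONCLUSION[0])
proved = entails(PREMISES, CONCLUSION)
```

The plan produced at the abstract level names the same three steps as the text: take A, obtain B from the premise containing A and B, then obtain C from the premise containing B and C. That the detailed check also succeeds is what makes this the simplest case of planning.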

Cases of planning where there is rather more conceptual distance between 
solution of the model problem and solution of the original problem may still be 
relatively simple, in the sense that both problems are clearly of the same general 
form, the one being a small-scale version of the other. This is literally true of a 
recent program described by Kelly¹ where the task is to extract an accurate
outline of a person’s head from a black and white photograph of the person 
standing in front of various backgrounds containing a number of other 
objects. The output of the program is an edge picture of the head alone 
—that is, the edges of background objects and the interior details of the
head have been suppressed. (Kelly’s overall purpose is to improve general 
techniques for picture processing by computers, but he provides an ominous 
reminder that the Bertillon system for personal identification via measure- 
ments of the face and body had wide use in police work prior to the dis- 
covery of the usefulness of fingerprints.) The technique employed involves a 
form of planning in which a smaller, less detailed, picture is extracted from the 
original picture; the desired edges are located in the small picture, and are then 
used as a plan for finding edges in the original. In principle, the power of the 
program could be increased by allowing for this planning procedure to be applied 
recursively, the program itself calling for further size reduction if in its esti- 
mation there is too much detail at the current level. If there is too much detail 
in a picture then some of the edges may be difficult to find; for instance, edge
followers may be misled onto temporarily strong false paths, retreat from which
involves large amounts of backtracking; and genuine edges depicted in grey-scale
pictures often vanish, only to reappear some distance away. It is for this sort of
reason that the preliminary processing of a less detailed smaller picture is useful. 
Since there are fewer edges to consider in the small picture, false paths are de- 
tected more easily; and backtracking is less of a problem because the data- 
structures that have to be erased are much smaller. However, to find the head 
edges in the small picture is not in itself to find them in the larger picture. Rather, 
the plan is used as a guide to final edge-detection. If there is a straight edge 
between points A and B in the plan, the program concludes that there must be an 
approximately straight edge between points A’ and B’ in the original picture; the 
plan-following edge-detector therefore searches a relatively narrow band between 
A’ and B’ so that false trails are quickly recognised as such. It should be noted that 
significant edge-finding in the plan depends upon prior assumptions about the 
usual shape of human heads which are drawn from the programmer’s knowledge: 
thus the correct outline of a unicorn’s head would not be found by Kelly’s pro- 
gram; nor would the devil’s slender horns be detected by a program (or a person) 
assuming that straight edges in the small-scale plan must correspond to approximately
straight edges in the original. In other words, Kelly’s program does not
function in an epistemological vacuum, but knows something about what it is
looking for—namely, human heads. A general purpose edge-finder could be 
developed by feeding in relevant knowledge if there was available a general 
language of picture description in terms of which the programmer could communicate
with the machine. Since there is no such language at present, Kelly’s
machine is highly artificial in that it can recognise only one narrowly defined class 
of pictures. But it is ‘natural’ in taking into account global information about 
overall structure as well as purely local information. Global information about 
the structure of relations between local edges (if the ‘top of the head’ has been 
identified, the ‘sides’ must slope down rather than up) is used in forming and in 
using the plan. 
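
The planning procedure just described can be illustrated by a toy sketch. It is not Kelly’s program: the function names, the block-averaging used to shrink the picture, and the choice of one edge column per row are all assumptions made for illustration. The sketch shrinks a picture, finds the strongest intensity step in the small picture, and then searches the full picture only in a narrow band around the position the plan suggests.

```python
# Illustrative sketch only: a toy coarse-to-fine edge finder, not Kelly's program.
# All names (shrink, strongest_step, plan_edges, follow_plan) are hypothetical.

def shrink(img, k):
    """Average k-by-k blocks to obtain the smaller, less detailed picture."""
    rows, cols = len(img) // k, len(img[0]) // k
    return [[sum(img[r * k + i][c * k + j] for i in range(k) for j in range(k)) / (k * k)
             for c in range(cols)]
            for r in range(rows)]

def strongest_step(row, lo, hi):
    """Return the column c in [lo, hi) with the largest jump |row[c+1] - row[c]|."""
    lo, hi = max(lo, 0), min(hi, len(row) - 1)
    return max(range(lo, hi), key=lambda c: abs(row[c + 1] - row[c]))

def plan_edges(img, k):
    """The plan: one edge column per row of the small picture."""
    return [strongest_step(row, 0, len(row) - 1) for row in shrink(img, k)]

def follow_plan(img, plan, k, band):
    """Search each full-size row only in a narrow band around the planned edge."""
    edges = []
    for r, row in enumerate(img):
        planned = plan[min(r // k, len(plan) - 1)]
        guess = (planned + 1) * k - 1  # block boundary, in full-size columns
        edges.append(strongest_step(row, guess - band, guess + band + 1))
    return edges

# Toy picture: dark (0) left of column 9, bright (100) from column 9 onwards,
# plus one bright speck that acts as a strong false trail in row 3.
picture = [[0] * 9 + [100] * 7 for _ in range(8)]
picture[3][2] = 255

plan = plan_edges(picture, k=4)
edges = follow_plan(picture, plan, k=4, band=4)
naive = strongest_step(picture[3], 0, len(picture[3]) - 1)
```

In this toy case an unplanned search of row 3 is captured by the speck, whereas the plan-following search never looks at it, because the speck lies outside the band. The sketch also shows the limitation noted above: an edge that is far from straight would fall outside the band and be missed.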

The strategy of planning shows the importance of finding an ‘appropriate’
alternative representation for a problem—that is, one that enables the problem 
solver (man or machine) to approach the reformulated task with increased likeli- 
hood of success, given the particular information processing abilities available. 
This is often the most intelligent aspect of the whole thought process. Simon and 
Newell’s definition of planning should not be taken to imply that, while there is 
simplification by way of deletion of detail, the general form of representation 
must be the same in the plan as it is in the original statement of the problem. 
Sometimes this is the case: for instance, GPS deletes some of the logical infor- 
mation originally given in the problem, while Kelly’s edge-detector tempor- 
arily ignores pictorial details. But sometimes the alternative forms of representa- 
tion involved are more diverse than they are in GPS or in the edge-detector. 
Such cases raise questions about what it is for several representations to be 
equivalent to one another, how novel representations can be generated, and what 
are the general constraints which decide to what problems a given form of repre- 
sentation is appropriate. These questions have been considered with reference to 
artificial intelligence contexts by Amarel in a number of papers, including a 
recent discussion of a class of problems he calls ‘formation problems’.¹

It is characteristic of formation problems (in contrast with ‘derivation problems’ 
such as the logical theorem proving of GPS or the graph theoretical problems of 
the Graph Traverser) that the problem conditions are stated in a language 
different from that in which the desired solution will be stated. The crucial 
difficulty, then, is to find a way of moving between these two languages (a mech- 
anism for mapping the relations between them) which can help to locate the 
solution. In the context of machine intelligence, the goal is to construct by 
computer a program—in a given programming language—that satisfies condi- 
tions expressed in another language in the form of input-output correspondences. 
Amarel’s main point is that this task may be achieved indirectly, in the sense that 
candidate solutions are represented in terms other than (and somehow simpler 
than) the final language itself; it is within this representation that they are gen- 
erated, and then evaluated by reference to the problem conditions; the selected 
candidate is then transformed into the final expression of the required solution. 
This may be regarded as a form of planning, but Amarel’s interest lies in cases 
where the alternative forms of representation may be superficially very diverse. 
His general approach is first to find a grammar that can describe the language of 
constructible programs and assign structural descriptions to individual programs
(in other words, a set of generative rules for program construction); second, to
use these structural descriptions to project putative programs in an algebraic 
model, where they can be manipulated as relational expressions and where 

1 Meltzer and Michie [1971], chapter 23. 
structure-performance relationships can be analysed; third, to use this
algebraic model as a basis for a radical change in representation of the 
overall program formation problem, with a consequent change in strategy 
of solution so as to exploit the implicational structure of the model. In 
effect, the original formation problem is converted into a set of derivation 
problems. Clearly, the choice of grammar is crucial, given that there is no unique 
grammar but rather an indefinite number of possible grammatical descriptions 
of the programming language concerned. The chosen grammar must assign struc- 
tural descriptions that lead to the discovery of regularities between program 
structure and function that can be used in the search for a solution. Amarel con- 
cludes that the grammar should allow for each structural description in the gram- 
mar to be interpreted in a mathematical system where there exists a structure of 
relationships that can be used in relating program structures to the functional 
conditions of the desired program. Such a mathematical model will also show
formal properties of programs such as correctness, equivalence, and inclusion, 
matters which are the concern of many technical papers in machine intelligence.¹
Amarel illustrates these general principles by producing a grammar of programs 
in which the data-flow of programs is represented by a form of directed graphs 
he calls river-like graphs; the model corresponding to this grammar is a modified 
algebra of relations, of which the atomic elements correspond to elementary 
commands in the programming language and the operations to program compositions;
the rich (lattice) structure of relations in the model provides a framework
for devising a strategy for generating sensible candidate programs and for selec- 
ting the desired solution. Thus the specific strategy of formation of programs grows 
directly from reasoning within the model, although it is the grammar that deter- 
mines the programming units and their possible modes of aggregation. 
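
The indirect route described above (candidates generated in a simpler representation, evaluated against input-output correspondences, and only then transformed into the final language) can be sketched in a few lines. This is a deliberately naive illustration and not Amarel’s method: it enumerates candidates blindly, whereas his point is that the grammar and the algebraic model let one avoid blind search. The three elementary commands and all function names are assumptions invented for the example.

```python
from itertools import product

# Hypothetical elementary commands of a tiny programming language.
COMMANDS = {
    "inc": lambda x: x + 1,
    "dbl": lambda x: x * 2,
    "neg": lambda x: -x,
}

def run(candidate, x):
    """Evaluate a candidate, represented simply as a sequence of command names."""
    for name in candidate:
        x = COMMANDS[name](x)
    return x

def form_program(examples, max_len=4):
    """Return the shortest command sequence consistent with every (input, output) pair."""
    for length in range(1, max_len + 1):
        for candidate in product(COMMANDS, repeat=length):
            if all(run(candidate, i) == o for i, o in examples):
                return candidate
    return None

def to_source(candidate):
    """Transform the selected candidate into an expression in the final language."""
    expr = "x"
    for name in candidate:
        expr = f"{name}({expr})"
    return f"lambda x: {expr}"

examples = [(1, 4), (2, 6), (5, 12)]   # input-output correspondences
found = form_program(examples)
source = to_source(found)
```

The problem conditions here are stated as pairs of numbers, while the solution is stated as program text; the tuple of command names is the intermediate representation in which generation and evaluation take place.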

At this point the question arises whether Amarel’s work is a study in artificial 
intelligence, whether it is even a prolegomenon to a possible program. To answer 
this question one should note that there are three major steps in the transition 
from the original representation of Amarel’s problem to its improved representa- 
tion within which formation of the desired solution can proceed. First, a theore- 
tical model of the overall possible program space is found; then the potentially 
fruitful properties of this model are elucidated; last, the model is used in for- 
mulating an overall (derivational) problem solving strategy and in building a 
repertoire of problem solving moves. The mechanisation even of the third step 
is challenging, although it is possible that automatic ways of exploring and exploit- 
ing (for instance) horizontal and vertical relations in a lattice structure might be 
found. The mechanisation of the second stage is more difficult, since it involves 
the notion of ‘interesting’ theorems within the mathematical model, namely 
theorems that are likely to contribute to the formulation of a powerful problem 
solving procedure. And, as Amarel points out, the mechanisation of the first step 
is the most difficult of all. One cannot even assume that one chooses from the set 
of existing models, for it may be necessary to develop a new type of mathematics; 
Amarel himself extended the existing algebra of relations to allow for a type of
cascading that occurs in programs.

In other words, it is easier to mechanise the use of powerful representations
than it is to automate their discovery, particularly when those representations have
been produced with the ultimate goal of automation in mind. Artificial intelli- 
gence, at least in its current state, leaves the task of discovering such representa- 
tions to the human being, and relies on him also to express problems appropriately 
before they are input to the machine. Thus the Graph Traverser can handle only 
problems that are amenable to representation as a graph, and reports of work 
using the Graph Traverser take it for granted that this representation has been 
effected, so providing a matrix within which problem solving can begin. But 
although the program’s problem solving begins at this point, human intelligence 
is initially employed to translate the significant aspects of the commercial trav- 
eller problem from English into mathematical form, or to recognise that a puzzle 
concerning the movement of blocks sliding within a rigid frame can be expressed 
as a conundrum in graph theory with which the program can then deal. Similarly, 
it was the logical intuition of the GPS programmers which saw significance in the
fact that a term appeared in the putative conclusion which did not appear in all 
the premises, so suggesting a simple form of planning (so simple that it could be 
automatically achieved by the machine) as an efficient reformulation of the prob- 
lem. 

Indeed, the authors of a recent program designed to achieve results naturally 
reached only by human intelligence apologise for offering their chapter to a 
volume of papers on machine intelligence, because the novel representation of 
their particular problem, like the formal rules for using it, was discovered by
themselves and not by their programs. Longuet-Higgins and Steedman¹ are
interested in the introspectively obscure process of musical interpretation tacitly 
carried out by a listener who hears an unknown melody and ‘intuitively’ assigns 
its metre and key so that he could write it down correctly in musical notation. 
There is more to this than finding the correct note-lengths and the correct posi- 
tions of notes on the keyboard: starting with a middle-A crotchet, one could 
write God Save the Queen either in 4/4 time in the key of B-flat, or in 3/4 time in 
the key of A. It is intuitively obvious even to a musical novice that the second 
form is correct and the first inappropriate. But it is no easy matter to state pre- 
cisely why this is so, or to explain how it is possible for the correct metre and key 
to be established on first hearing. Longuet-Higgins and Steedman attempt to 
provide formal rules for transcribing the fugue subjects of Bach’s Well Tempered 
Clavier, detecting both the harmonic relations between the notes and the metrical 
units into which they are grouped. Their programs mirror the progressive 
character of musical comprehension, in that ideas about metre and key 
become more definite as the fugue subject proceeds, and may crystallise 
well before the end. Their attempt is not wholly successful, since their metrical 
program cannot deal with cases where all notes and rests are of equal duration 
(6 out of the 48 fugues), and it makes mistakes in six other cases (in four of which 
a musician might commit the same error). But their harmonic algorithm finds 
every key sign and notates every accidental correctly. This algorithm works on 
the basis of a two-dimensional representation of harmonic relations. The spatial
relationships within a lattice-array of numbers encode the basic musical intervals
of a perfect fifth and a major third, and each key is represented by a box of 
‘major’ or ‘minor’ shape superimposed on the lattice at a certain position.

1 Meltzer and Michie [1971], chapter 15.

Exploitation of these spatial relations, together with a simple rule of tonic-dominant
preference for the first note (needed only in those cases where the lattice
allows for several possibilities), assigns all 48 fugues to their correct keys.
Further rules allow for the detection of ‘accidental’ notes outside the original 
key and for modulation from one key to another. Considered as a whole, the 
harmonic algorithm appears somewhat contrived: although some of the rules 
seem ‘natural’ in the sense that they may tacitly contribute to one’s everyday 
appreciation of fugue, there is no suggestion that one’s interpretative competence 
utilises a subliminal two-dimensional array of numbers with ‘major-shaped’ and 
‘minor-shaped’ boxes. That is, this representational lattice is more appropriate to 
a machine than to a man. Moreover, the program treats only of ‘dead-pan’ 
performance and so entirely omits certain aspects of musical appreciation. Nevertheless,
it remains true (and is perhaps surprising to those who doubt the a
priori possibility of formalising human thinking) that it allows for absolutely
correct harmonic notation of the Forty-Eight. The power of the harmonic 
algorithm is surprising even though it admittedly derives from the musical 
competence of the programmers and their natural intuitions about (for example) 
the importance of perfect fifths and major thirds. 
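
A rough idea of how such a lattice can pick out a key is given by the following sketch. It is not the authors’ algorithm. The seven-cell ‘major’ shape is an assumption, chosen only because it reproduces the ordinary major scale; the minor shape, the tonic-dominant rule, and the rules for accidentals and modulation are omitted; and pitches are reduced to twelve pitch classes, which discards the distinction between enharmonically equivalent notes on which correct notation depends. All names are hypothetical.

```python
# Lattice axes: one step along x is a perfect fifth (7 semitones),
# one step along y is a major third (4 semitones).
FIFTH, THIRD = 7, 4

# Assumed 'major' box, as (x, y) offsets from the tonic at (0, 0).
MAJOR_BOX = [(-1, 0), (0, 0), (1, 0), (2, 0), (-1, 1), (0, 1), (1, 1)]

def box_notes(tonic, shape=MAJOR_BOX):
    """Pitch classes (0 = C, 1 = C sharp, ...) covered by the box placed at a tonic."""
    return {(tonic + FIFTH * x + THIRD * y) % 12 for x, y in shape}

def candidate_keys(notes):
    """Tonics whose major box contains every note heard so far."""
    heard = {n % 12 for n in notes}
    return [tonic for tonic in range(12) if heard <= box_notes(tonic)]

def narrowing(notes):
    """Candidate keys after each successive note of the melody."""
    return [candidate_keys(notes[:i]) for i in range(1, len(notes) + 1)]

# Opening of 'God Save the Queen' starting on A, as MIDI note numbers:
# A A B | G# A B | C# C# D
melody = [69, 69, 71, 68, 69, 71, 73, 73, 74]
history = narrowing(melody)
```

In this example five major keys remain possible after the first three notes, two after the G sharp, and only A major (pitch class 9) once the D has been heard, which mirrors the progressive crystallisation of key described above.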

Devotees of the information processing approach to intelligence assume that 
the impossibility of mechanising the discovery of appropriate representations is 
merely temporary. This partly accounts for the fact that programs that embody 
previously articulated scientific theories are often described by their programmers
as preliminary exercises in automatic theory construction, although actually they 
are nothing of the kind. For example, the Dendral program of Buchanan and 
Feigenbaum¹ incorporates the process by which an analytical chemist identifies
an unknown organic compound by means of mass spectroscopy. Like the chemist, 
the program formulates probable hypotheses about its molecular structure on the 
basis of its spectrograph, and then tests these hypotheses by way of further predictions.
The program has been of real use to chemists in giving for the first time
a complete list of the set of possible isomers of a given empirical formula within 
several families, including amines and thioethers; and its performance as a prac- 
tising analyst compares favourably with that of expert chemists for certain 
classes of compound. Moreover, the authors’ detailed description of the development
and use of the various sections of the program is interesting not only for
its relevance to basic problems in artificial intelligence—some of which were 
initially raised in connection with GPS—but also for the light it throws on 
scientific reasoning as exemplified in this area of organic chemistry. However, 
since the theory of mass spectrography had to be cleverly elicited from specialist 
chemists so as then to be represented in a form suitable to the LISP program- 
ming language, the authors’ claim to be studying the processes of scientific infer- 
ence whereby models are constructed to explain data can be accepted only with 
respect to the natural intelligence involved. One should be equally wary of Hunn 
and Lederberg’s disclaimer concerning their Genetic Counsellor, which diag- 
noses the relevant genetic make-up of prospective spouses on the basis of their 
families’ medical histories and then assesses the probability of inherited defects 
in their offspring: ‘the program is an initial, very limited, emulation of the induc- 
tive process in that it does not generate its own hypothesis, but selects from a set 

1 Meltzer and Michie [1969], chapter 14, and [1971], chapter 12.
based on genetic theory. We make few claims for its standing as an example of
automated induction in its present form.’¹ This remark suggests that a more
sophisticated version of the same basic program could simulate the ‘inductive’ 
process of scientific discovery—a suggestion that can hardly be taken seriously. 
But the possibility that some program or other might in principle do so is less 
obviously absurd. In a general discussion of the relation between natural and 
artificial intelligence some years ago, Scriven² envisaged a ‘compleat robot’ with
creative originality equal to that of human beings, based on analogy-assessing 
procedures whose development he declared to be ‘feasible but difficult’. Such a 
robot presumably would be able to recognise for itself the analogy between a 
sliding-block puzzle and a graph, or between a given programming language and 
a specific algebraic system, and could effect the required transformations of the 
original problem; similarly, it could originate scientific models as well as finding 
novel representations (in LISP, for instance) for familiar scientific theories.

That Scriven was in many ways over-optimistic could be admitted even by 
like-minded individuals: Levy, for example, points out that artificial chess-playing 
has reached stalemate rather than checkmate. But some critics of artificial 
intelligence hold that Scriven and his intellectual fellows are radically misled in 
their hopes for the mechanisation of thought. Such critics would argue that 
scientific discovery, and the problem translations previously described which 
are the crucial steps in solution of the relevant problems, all involve an insight of 
the creative imagination that could not possibly be programmed or incorporated in a
purely digital machine. Dreyfus, for example,³ insists that a machine operating
on determinate, unambiguous bits of data according to strict rules in principle 
could not parallel all the thought processes effected by the human mind, since 
men can handle essentially indeterminate information also. He rejects Simon and 
Newell’s assumption that all human thinking can be described in a digital formal- 
ism, and relates this assumption to the Western philosophical tradition which 
believes—with Plato—that all knowledge could be stated in explicit definitions. 
He prefers the phenomenological tradition, claiming that the recognition of 
significance in any situation depends on an intuitive interpretation with refer- 
ence to a specifically human context or horizon ‘which gives to facts their 
significance but need not itself be analysed in terms of facts’. Even those problems
which can apparently be solved purely by recourse to formal rules would be said 
by Dreyfus to involve such interpretation, for he appeals to Wittgenstein’s notion 
of forms of life in rejecting the assumption that—in the last analysis—there could 
be rules for interpreting rules. Polanyi also favours a phenomenological approach, 
and regards a cybernetic simulacrum of thought as impossible since human reasoning
employs integrative principles of tacit inference or global knowledge of which
one is not introspectively aware, but which crucially determine the nature of the 
thought-contents of which one is focally aware. This tacit knowledge is said to be 
indeterminate, in the sense that its content cannot be explicitly stated. Or, 
rather, its complete statement is impossible in explicit terms: ‘tacit knowing is 
the fundamental power of the mind which creates explicit knowing, lends meaning
to it and controls its uses. Formalisation of tacit knowing immensely expands
the powers of the mind, by creating a machinery of precise thought, but it also 

1 Meltzer and Michie [1971], chapter 13. 2 Crosson [1970], chapter 5.
3 Crosson [1970], chapter 7. 4 Ibid., chapter 10.
opens up new paths to intuition.’⁴ Dreyfus and Polanyi would approve, if not
accept, Longuet-Higgins’s apology for including an example of ‘the formalisation 
of tacit knowing’ in a book nominally concerned with machine intelligence.

Some of the difficulties remarked by these radical critiques of artificial intelligence
have been noted also by workers within the field, with varying degrees of
confidence in their eventual solution. Thus Neisser,¹ a decade ago, criticised the
currently prevailing imitation of man by machine for neglecting the largely 
unconscious emotional context and motivational interests which give to human 
thought its direction and its significance. In his own programming work he has 
tried to allow for these typically human dimensions of thinking, at least in a pre- 
liminary fashion, eschewing the artificiality that results from the common 
cybernetic assumption that intelligence can be regarded in isolation from the 
whole context of personal life. Mackay² has also emphasised that ‘the only class of
artificial mechanism worth considering as a candidate for “mentality”’ would show
analogues of personal qualities such as hoping, fearing, and feeling. In his view,
however, no cybernetic achievement within his own laboratory or any other
could enable one definitively to decide whether machine intelligence is artificial 
or real, since this personal question is answerable only (by man or machine) in the 
first person. Even programs conceived entirely without reference to personal 
dimensions of intelligence—such as Kelly’s edge finder—have taken account of the 
need to use global information and to know what one is looking for in order, for 
example, to distinguish significant from insignificant pictorial details. In general,
programs aimed at realistic pattern-recognition are increasingly based on a methodology
which assumes that introspectively ‘spontaneous’ perception is intrinsically
structured, and that knowledge both of general and of specific object-properties
enters actively into their recognition. For instance, Huffman³ has tried to
formalise the constraints to which one tacitly appeals in interpreting pictures and 
perspective views of physical objects, and by means of which one intuitively 
‘recognises’ certain drawings as depicting ‘impossible’ or ‘nonsensical’ objects.
Although he concentrates on plane polyhedra, he also considers smooth curved 
objects like saddles and pieces of cloth or paper that can be folded, creased, and 
crumpled. Global information about the inter-relations of significant parts of the 
picture contributes to the interpretation of those parts—for example, to the 
interpretation of a line as representing a convex or a concave edge in the physical 
surface portrayed. Global information is stressed too by Barrow and Popplestone⁴
and by Guzman,⁵ who in addition discuss the difficulties involved in providing
the machine with an internal model of a particular class of objects sufficiently 
flexible to enable the program to recognise new instances by analogy with familiar 
ones. That these difficulties are immense, whether or not they are insuperable, 
is indicated by the fact that Guzman finds it necessary to include a specific provision
for a dent in order that his model of hats should mediate recognition of a
Homburg as well as of a bowler. Finally, Hayes⁶,⁷ points out that the inner representations
of the environment that are crucial to intelligent action must be
decidedly more complex than the models currently assumed in artificial


1 Ibid., chapter 9. 2 Ibid., chapter 6.
3 Meltzer and Michie [1971], chapter 19. 4 Ibid., chapter 21.
5 Ibid., chapter 20. 6 Meltzer and Michie [1969], chapter 28.
7 Meltzer and Michie [1971], chapter 25.
intelligence work, particularly in theorem-proving systems based on first-order
logic. A robot equipped only with the resolution formulation of first-order logic, 
as current robots typically are, could not act sensibly within the real world. An 
intelligent robot would need a description of the way in which truth-values 
change as it moves through time, which requires not only an adequate temporal 
logic, but also a causal logic encoding a vast knowledge of natural physics that 
could not be axiomatised, even though it might be internally represented in some 
way. The solution of problems which assume a trivial physics—like those handled 
by the Graph Traverser, where operations (‘actions’) have no side-effects—does 
not suffice to cope with decisions about real life actions. For in the latter case a 
central difficulty is to distinguish those properties of an object (and of its sur- 
roundings) that one expects to remain unchanged by one’s prospective action 
from those properties which may reasonably be expected to alter. In other 
words, anticipatory planning can be a very complex matter. Sensible action can
be carried out only if prior planning can utilise knowledge about the causal relations
within the world. However, Hayes’s pessimism about the axiomatisation of
such everyday knowledge, and so about the adequacy of classical theorem- 
proving approaches to mechanised reasoning, does not imply that he regards 
artificial intelligence as impossible. He assumes that alternative logics and, 
especially, greater tolerance of potential (though not actual) inconsistency between 
different aspects of the representations manipulated in thinking might be incor- 
porated in machines as they apparently are in human brains. 

In sum, one may argue a priori for or against the possibility of artificial intel- 
ligence; and one may examine current attempts at mechanised thought, asking 
how artificial they actually are—in which case one’s answer should be continu- 
ally revised. The three collections of papers that I have reviewed together 
constitute a base of sources from which such enquiries can proceed. Anyone who 
expects it to be easy to answer either question may find it salutary to remind 
himself that the lexicographers of Chambers’s Dictionary could not decide 
whether by ‘artificial’ Shakespeare meant ‘creative’ or ‘merely skilful’. 


MARGARET A. BODEN 
University of Sussex 
The philosophical worth of the volume is primarily to be found in the contributions
of the editors, and of L. de Broglie, H. J. Treder, E. Wigner, J. Park and
H. Margenau, and K. R. Popper. As measured in terms of the average density of 
new philosophical insights per page, the philosophical interest of this volume is 
not very great. Many of the articles are of a purely technical nature, and will 
presumably be of interest only to physicists: A. Kastler, D. Bohm, H. Hönl,
D. Caldwell and H. Eyring, O. Costa de Beauregard, F. Bopp, and J. P. Vigier on 
certain topics in atomic theory, thermodynamics, and elementary particle theory. 
There is another group of articles by W. Elsasser, L. Rosenfeld, H. Bondi, 
1 Review of W. Yourgrau and A. van der Merwe, (eds.): Perspectives in Quantum Theory. 
A. Mercier, and P. Bernays, which concern the philosophical aspects of quantum
theory, but seem to me to be of insufficient interest to warrant detailed discussion 
here. The editors’ introduction begins with a history of Landé’s contributions
to the theory of atomic spectra, and concludes with an excellent discussion of the 
more philosophical contributions to quantum theory which he produced in his 
later years. They summarise the principal elements of the Bohr-Heisenberg 
interpretation of quantum theory: the uncertainty principle’s limitations on the 
simultaneous measurability of position and momentum and the consequent 
limitations on the applicability of these classical concepts to atomic objects; 
the necessity to use certain pairs of concepts (e.g. wave and particle, position and 
momentum) which are equally necessary to the exhaustive description of atomic 
phenomena, but which cannot be combined into a consistent picture of atomic 
objects; the impossibility of supposing that atomic objects follow definite 
trajectories between observations and the consequent necessity to speak only of
the results of interactions with measurement-devices and not of any independent 
behaviour of atomic objects. 

Against all of this, the editors explain, Landé maintained that from the fact 
that the probability density of an electron may have a spatial pattern similar 
to that of the intensity of a wave, it does not follow that an electron is a wave. 
Furthermore, the uncertainty principle refers to the dispersions in measurements 
on members of ensembles of particles, and not to the precision of measure- 
ments on individual particles. Hence, it does not restrict simultaneous measur- 
ability of position and momentum, or applicability of those concepts. There is 
therefore no obstacle thus far to the common-sense and classical view of matter 
as composed of particles with definite trajectories. To deal with the famous 
objection that the particle view implies that an electron passing through one slit 
in a screen is sensitive to the presence or absence of a second slit, Landé applied 
the Duane-Ehrenfest-Epstein theory of the dependence of the probabilities of 
momentum transfers upon the matter distribution for the entire screen. 

The editors mention Landé’s rejection of ‘the intrusion of subjectivity into 
quantum physics’ allegedly allowed by Bohr and Heisenberg, but do not make 
clear the sense in which the Copenhagen interpretation is supposed to be 
subjective, nor the grounds of Landé’s objections. They remark that Landé held 
a unitary view of matter as consisting of particles and a unitary view of radiation 
as consisting of waves. But they do not explain how the latter view can accom- 
modate the manifest particulateness of photons. Finally, they briefly describe 
Landé’s attempt to reconstruct quantum theory, on the basis of three simple 
‘non-quantal’ postulates. 

Most of the remaining contributions I shall discuss deal with themes central 
to Landé’s thinking about the interpretation of quantum theory. In his contribu- 
tion, de Broglie explains that he also rejects the Copenhagen view that atomic 
objects have both wave-like and particle-like aspects, though in incompatible 
experimental situations. Like Landé, he also regards ‘ψ’ as designating a purely
mathematical object (a probability amplitude) and not a spatially extended 
wave-like object. However, he does hold that there is an entity of this latter kind 
associated with each particle. In addition to the wave-function, then, there is 
‘a physical wave of very weak amplitude whose essential role is to guide the 
motion of strong local concentrations of energy constituting the particles’ (p. 8).
This was what he originally had in mind in 1923 when he first suggested
associating a wavelength λ = h/p with each particle of momentum p. It was only
later, through the work of Schrödinger and Born, that the ‘wave’ came to have a 
statistical significance. De Broglie, however, has persisted in developing his 
original line of thought, and outlines his results through 1968 here. Rather 
poignantly, he remarks that not many researchers have thus far shown an 
interest in his approach, and that his age does not permit him to hope that he 
will be able to carry it much further himself. 

Treder contributes an interesting discussion of the well-known argument 
between Einstein and Bohr about whether energy and time can be measured with 
arbitrary precision. Bohr claimed that Einstein’s proposed measurement
technique ignored the shift in clock readings required by his own general 
relativity theory, but Treder argues that this reply is irrelevant. 

Wigner considers a question closely related to Landé’s view that particles
may have definite positions and momenta: the question whether there exists a
function P(q, p) giving the probability-density for values q and p of position
and momentum. He also asks whether reasonable conditions can be imposed on
P(q, p) which result in its being uniquely determined by ψ. He first suggests the
condition that there exists a self-adjoint operator M(q, p) such that


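\[ P(q, p) = (\psi, M(q, p)\,\psi). \tag{1} \]

He also suggests the conditions

\[ \int P(q, p)\,dp = |\psi(q)|^2, \tag{2} \]

\[ P(q, p) \geq 0, \tag{3} \]

\[ \iint [f(p) + g(q)]\,P(q, p)\,dp\,dq = \bigl(\psi, [f(\hat{p}) + g(x)]\,\psi\bigr), \tag{4} \]

and then shows that they cannot all be satisfied.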

He then chooses to abandon the third requirement and try to discover what 
additional constraints suffice to determine a unique P(q, p) for each ψ. He finds
that certain invariance and symmetry conditions do not suffice alone, but that 
they do in conjunction with the requirement that 


The abandonment of (3), non-negativity of P(q, p), is a peculiar choice, since
it obviously disqualifies P(q, p) as a probability density, and thereby deprives
the search for a P(q, p) of any intuitive motivation. One would have thought
that (4) would be the first to go, since it embodies the assumption which Bell 
[1966] criticised von Neumann for attributing to hidden variable theorists. The 
assumption is that any sum of operators represents an observable, whose value 
is the sum of the values of the observables corresponding to the operators in the 
sum. Bell showed that this condition does not in general hold when the summed 
operators do not commute. Accordingly, when the observable represented by $f(\hat{p})$
has the value f(p), and that represented by g(x) has the value g(q), there is no
reason to expect the sum of these operators to represent an observable with the 
value f(p)+g(q). Hence, there is no reason to expect (4) to hold. Margenau 
and Park [1968] have also denied the reasonableness of (4) by providing grounds 
for denying that the observable $\mathcal{A}+\mathcal{B}$ corresponds to the operator A+B if A
and B do not commute. 
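
The point about sums of non-commuting operators can be checked numerically in the simplest case. The following is a minimal sketch, not drawn from any of the papers under review: it assumes the two operators are the Pauli matrices for spin components along x and z (in units of ħ/2), and the helper `symmetric_eigenvalues` is a hypothetical name introduced only for this illustration. It compares the possible values of the summed operator with the sums of the possible values of the summands.

```python
import math


def symmetric_eigenvalues(a, b, d):
    """Eigenvalues (ascending) of the real symmetric 2x2 matrix [[a, b], [b, d]]."""
    mean = (a + d) / 2.0
    radius = math.hypot((a - d) / 2.0, b)
    return (mean - radius, mean + radius)


# Each matrix is stored as the triple (a, b, d) of [[a, b], [b, d]].
SIGMA_X = (0.0, 1.0, 0.0)    # spin component along x (illustrative choice)
SIGMA_Z = (1.0, 0.0, -1.0)   # spin component along z; does not commute with SIGMA_X
SIGMA_SUM = tuple(x + z for x, z in zip(SIGMA_X, SIGMA_Z))

eig_x = symmetric_eigenvalues(*SIGMA_X)
eig_z = symmetric_eigenvalues(*SIGMA_Z)
eig_sum = symmetric_eigenvalues(*SIGMA_SUM)

# Every value obtainable by adding one possible value of each summand.
naive_sums = sorted({x + z for x in eig_x for z in eig_z})

print("values of sigma_x:", eig_x)
print("values of sigma_z:", eig_z)
print("values of sigma_x + sigma_z:", eig_sum)
print("sums of individual values:", naive_sums)
```

In this case none of the values of the summed operator coincides with any sum of individual values, which is the kind of failure that makes an additivity requirement such as (4) questionable for non-commuting operators.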

The contribution of Park and Margenau to the Landé festschrift is a shortened 
version of their [1968]. Perhaps its publication here will help it reach the wide 
audience it deserves. The paper concerns the empirical consequences of the 
noncommutability of certain quantum-mechanical operators. Like Landé, they 
reject the inference from the uncertainty principle to restrictions on simultaneous 
measurability of observables whose operators do not commute. Likewise, they 
reject the projection postulate and von Neumann’s measurement transformation, 
which also imply incompatibility of such measurements. They show, however, 
that standard quantum-theoretical postulates do imply that if $\mathcal{A}$ and $\mathcal{B}$ are
simultaneously measurable, then the corresponding operators commute, as 
von Neumann showed without using the premises just criticised. 

Park and Margenau then give two counter-examples to show that von 
Neumann’s theorem is false. The first involves measurement of two positions 
and the travel-time between them, which yields both position and momentum 
at the end point. This example is similar to one suggested by Heisenberg [1930] 
and rejected by him for reasons which Park and Margenau effectively criticise. 
Though one counter-example is all they need, it is unfortunate that their 
second one is extremely dubious. An example of the kind of case they have in 
mind would be the situation in Bohm’s [1951] version of the Einstein-Podolsky- 
Rosen paradox. Two spin-½ particles initially form a system with total spin
equal to 0. After they split apart, we can determine what a measurement of $s_z^2$
would yield by measuring $s_z^1$. (The superscripts distinguish the two particles.)
Margenau and Park therefore interpret a measurement at time t of $s_z^1$ as being
also a measurement of the value of $s_z^2$ at t, regardless of whether particle 2 has
itself interacted with a measuring device. They then claim that $s_x^2$ can be measured
directly at t to obtain simultaneous values of $s_z^2$ and $s_x^2$.

However, one implication of the work on hidden variables by Kochen and 
Specker [1967] and Bell [1964] is that measurements of spin do not in general 
yield values of spin possessed before the measurement. (See Gardner [1972a].) 
Hence, from the fact that a measurement of $s_z^2$ at or after t would yield a certain
value (with unit probability), it does not follow that at t, $s_z^2$ already has that value.
If a measurement of $s_x^2$ is taken at t, the difficulty is compounded, since this
measurement may alter the value which $s_z^2$ would have displayed if it were
measured at t. Hence, a measurement of $s_z^1$ and $s_x^2$ at t cannot be interpreted as
yielding values of $s_z^2$ and $s_x^2$ at t.

Having presented their counter-examples to von Neumann’s compatibility 
theorem, Park and Margenau argue that the best way to make it a non-theorem 
is to eliminate the standard postulate that every observable corresponds to an
Hermitian operator. 

A serious confusion in Park and Margenau’s generally defensible position
concerns their notion of the ‘latency’ of observables. Many quantum theorists
agree with Bohr and Heisenberg that an observable only has a value when being
measured, or only when the system is in an eigenstate of that observable.
But I know of no one besides Park and Margenau who hold (at least sometimes)
that observables never have values, i.e., that they are never quantitative proper-
ties of atomic objects:


‘Thus for the “quantum billiard ball”, it is proper to speak of the numerical 
results of position, momentum, energy, or angular momentum measure- 
ments, but it is improper to interpret these numbers as past, present, or 
future properties of the ball.’¹


Thus, observables are ‘latent’. They seem to espouse the same view in the Landé 
Festschrift (p. 58). But on the next page of their [1968], they say that an observable 
is a property of a system when being measured, or when the system is in an 
eigenstate of that observable. Thus, Park and Margenau not only contradict 
themselves flatly, but espouse here the Copenhagen interpretation, of which they 
are among the leading critics! Park, for example, has shown elsewhere (in his 
[1968]) that the assumption just mentioned—that the state has not merely a 
statistical significance, but determines what properties an individual system 
possesses—leads to a well-known disaster in the case of Schrödinger’s cat.

The contribution by Popper is the first publication by him I know of in which 
he criticises not only certain philosophical excrescences of the quantum theory, 
but the basic assumptions of the theory itself. In his distinguished earlier work 
on quantum theory (Popper [1935] and Popper [1967]), he criticised wave- 
particle complementarity, misinterpretations of the uncertainty principle, and 
the projection postulate. But here he claims that the fact that quantum theory 
faces numerous unsolved problems relating to elementary particles and gravita- 
tion, and the fact that it implies an extremely implausible kind of action at a dis- 
tance, point ‘to the need for a more general theory’ (p. 184). The point about non- 
locality is based upon Bohm’s version of the Einstein-Podolsky-Rosen paradox, 
which I mentioned earlier. However, Popper seems to mistake the import of the 
paradox. He first claims that it refutes the version of the Copenhagen interpreta- 
tion according to which an observable has a value if and only if it is being 
measured. His argument is that on this view a measurement of $s^1$ would instantly
give a value to $s^2$, even though the two particles are widely separated. But this
argument is inconclusive, since the Copenhagen theorist may deny that $s^2$ has
been measured until particle 2 interacts with a measuring device. This sort of
argument is only effective against the version of the Copenhagen interpretation
which holds that an observable has a value if and only if the system is in a
corresponding eigenstate, since a measurement of $s^1$ indisputably places particle
2 into an eigenstate. (See Gardner [1972b].)

Popper’s main point, however, is that in a case (involving, e.g. spin or polarisa- 
tion) where a measurement yields a value different from that possessed before 
the measurement, non-locality is a consequence not of some version of the 
Copenhagen interpretation, but of quantum theory itself. His argument is that 
if in such a case


‘the “measurement” of S₁ [i.e. photon 1] consists in S₁ passing through a
polarizer, and if this “measurement” of S₁ informs us about the state [i.e.
polarization] of S₂, then the kind of action at a distance described by Ein-
stein is not merely part of . . . Bohr’s interpretation—but part of quantum
mechanics itself.’ (p. 188).


1 Park and Margenau [1968], p. 217.


This argument is extremely puzzling. Finding out something about S₂ by
means of a measurement on S₁ does not in itself constitute action at a distance,
whether or not S₁ is altered by the measurement. Learning about S₂ does not
imply causing something to happen at S₂. Perhaps Popper has in mind the argu-
ment of Dicke and Wittke [1960], whom he cites in this connection. What they 
find paradoxical about this situation is that, 


‘a plane-polarization measurement made on one photon enables us to
predict that the other photon is plane-polarized and, further, to determine
the direction of its polarization. On the other hand, a measurement of the
circular polarization of the first photon enables us to predict that the other
photon is circularly polarized and what the direction of its circular polariza-
tion is.... But it is very difficult to see how this measurement can be
thought of as affecting the polarization of the other photon.’¹


But quantum mechanics without the Copenhagen interpretation—that is, 
with the minimal statistical interpretation (Gardner [1972b])—does not imply
that a measurement of S₁’s polarisation tells us what S₂’s polarisation is. It only
tells us what result we would get if we were to measure S₂’s polarisation. And
since Popper thinks it so important that a measurement of polarisation may yield 
a value different from that possessed before the measurement, he cannot very 
well identify the polarisation with the result which a measurement would yield 
if it were taken. 

Though Popper discusses Bell’s theorem² on the non-existence of local
hidden variables, it is strange that he does not use it to support his claim that
quantum theory is non-local. Bell (see also Wigner [1970]) showed that in the
case of the correlated spin-½ particles mentioned above, it is inconsistent
with quantum theory to suppose that the correlation ‘is due to information
carried by and localised within each particle, and that at some time in the past
the particles constituting one pair were in contact and communication regarding
this information’, as Clauser et al. [1969] put it. Thus, quantum theory implies
that outcomes of measurements on the two particles are correlated, despite the 
fact that the outcomes are not determined during the time the two particles can 
affect each other. This result may fairly be described as paradoxical, and might 
even lead to the kind of dissatisfaction with quantum theory voiced by Popper 
for other reasons. 
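
The conflict between the quantum correlations and locally carried information can be made concrete with a short calculation. The sketch below is an illustration under stated assumptions rather than a reconstruction of Bell's own proof: it assumes the familiar four-setting combination of correlations (the form usually associated with Clauser et al.), the singlet-state correlation −cos(a − b) for analyser angles a and b, and a particular choice of angles. The function names are hypothetical. The enumeration covers only assignments in which each particle carries a definite ±1 outcome for each setting.

```python
import math
from itertools import product


def singlet_correlation(angle_a, angle_b):
    """Quantum expectation of the product of the two +1/-1 spin outcomes (singlet state)."""
    return -math.cos(angle_a - angle_b)


def four_setting_combination(a, a_prime, b, b_prime, corr):
    """|E(a,b) - E(a,b') + E(a',b) + E(a',b')| for a correlation function corr."""
    return abs(corr(a, b) - corr(a, b_prime) + corr(a_prime, b) + corr(a_prime, b_prime))


# Illustrative analyser angles (an assumption of this sketch).
a, a_prime = 0.0, math.pi / 2
b, b_prime = math.pi / 4, 3 * math.pi / 4

quantum_value = four_setting_combination(a, a_prime, b, b_prime, singlet_correlation)

# Largest value attainable when each particle carries predetermined outcomes
# A, A' (particle 1) and B, B' (particle 2), each equal to +1 or -1.
local_bound = max(
    abs(A * B - A * Bp + Ap * B + Ap * Bp)
    for A, Ap, B, Bp in product((-1, 1), repeat=4)
)

print("quantum value:", quantum_value)
print("bound for predetermined outcomes:", local_bound)
```

The quantum value exceeds the bound obtained from predetermined outcomes, so the correlations cannot be reproduced by information fixed while the particles were in contact.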

Popper claims that Bell’s proof as it stands applies only to deterministic and 
not to stochastic hidden variables. ‘Deterministic’ presumably means that the 
values of the hidden variables at any time lawfully determine all subsequent 
values. However, though Bell says that ‘in a complete physical theory of the 
type envisaged by Einstein, the hidden variables would have dynamical signifi- 
cance and laws of motion’, he makes it clear that his proof considers only their 
values at a single instant, and accordingly makes no assumption about whether
their temporal evolution is deterministic.


1 Dicke and Wittke [1960], p. 120. 
2 Bell [1964].

Despite all these reservations, I think readers of this volume who confine
their attention to the articles discussed above will at least find the book pro- 
vocative. 

MICHAEL R. GARDNER 
Mount Holyoke College


UNDER THE SPELL OF BOHR*


Quantum Theory and Beyond is a collection of essays and discussions based on the 
proceedings of a colloquium held at Cambridge in July 1968. The following year 
Hilary Putnam published a paper, ‘Is Logic Empirical?’,¹ which, together with
the work of Kochen and Specker,² in my view throws an altogether new light
on the central problem discussed in this book: since my own present perspec-
tive is consequently very different from that of the contributors to the 1968 Cam-
bridge Colloquium, I shall first state the central problem as I now see it.

The fundamental problem in the foundations of quantum mechanics is the 
completeness problem. Roughly, what is at issue is the distinction between a purely 
epistemic ‘ignorance’ interpretation of the statistics, and an interpretation of the 
theory as ‘irreducibly statistical’. I suggest that a proper understanding of the 
sense in which quantum mechanics is a complete statistical theory leads to an 
immediate solution to the measurement problem. I shall preface my remarks on 
Quantum Theory and Beyond by a brief discussion of this thesis. 

*Review of Ted Bastin (ed.): Quantum Theory and Beyond, Cambridge 1971: Cam-
bridge University Press, £5.00. Pp. viii + 345.

1 Putnam [1969].

2 Kochen and Specker [1967].


A statistical theory may be regarded as characterised by a set of physical
magnitudes forming an algebraic structure of a certain kind, together with an
algorithm for assigning probabilities to ranges of possible values of these magni- 
tudes. There is a 1-1 correspondence between the set of idempotent magnitudes 
and the set of theoretical sentences: each idempotent magnitude corresponds to a 
theoretical sentence which is true if and only if the value of the magnitude is 1,
and false if and only if the value of the magnitude is 0. That is to say, the theory
involves a set of ‘statistical states’ which assign probabilities to theoretical
sentences val (A) ∈ S. (Read: The value of the physical magnitude A lies in the
set S of real numbers.) I shall refer to the given algebraic structure of the idem- 
potent magnitudes as the logical space of the theory. A statistical theory is com- 
plete if and only if the statistical states of the theory generate all possible proba- 
bility measures on the logical space. 

Now, the algorithm of quantum mechanics involves the representation of the 
statistical states of the theory by a certain class of operators in Hilbert space, the 
statistical operators, and the physical magnitudes by hypermaximal Hermitian 
operators in Hilbert space. The idempotent magnitudes are represented by the 
projection operators. Each statistical operator assigns a probability to the sentence 
val (A) ∈ S according to a certain rule. The peculiarity of quantum mechanics as
a statistical theory lies in the fact that the logical space is not a Boolean algebra,
but a partial Boolean algebra.¹ Essentially, a partial Boolean algebra is a partially
ordered set with a reflexive and symmetric (but not necessarily transitive) rela-
tion, termed ‘compatibility’,² such that each maximal compatible subset is a
Boolean algebra. (Thus, a partial Boolean algebra may be pictured as ‘pasted
together’ from its maximal Boolean sub-algebras.) A probability measure on a
partial Boolean algebra is any assignment of values between 0 and 1 to the elements
of the algebra, which satisfies the usual conditions for a probability measure on 
each maximal compatible subset of the algebra. 
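
The failure of transitivity for the compatibility relation can be seen in a three-dimensional example. The following is a minimal sketch under an explicit assumption: compatibility of two projection operators is taken here to mean that they commute, and the three rays chosen are purely illustrative. The helper names are hypothetical.

```python
def projector(v):
    """Projection operator onto the ray spanned by the real vector v."""
    norm_squared = sum(c * c for c in v)
    return [[vi * vj / norm_squared for vj in v] for vi in v]


def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def compatible(P, Q, tol=1e-12):
    """Assumed criterion for this sketch: P and Q are compatible if they commute."""
    PQ, QP = matmul(P, Q), matmul(Q, P)
    n = len(P)
    return all(abs(PQ[i][j] - QP[i][j]) <= tol for i in range(n) for j in range(n))


P_x = projector((1.0, 0.0, 0.0))   # ray along x
P_z = projector((0.0, 0.0, 1.0))   # ray along z, orthogonal to both other rays
P_d = projector((1.0, 1.0, 0.0))   # diagonal ray in the x-y plane

print("x compatible with z:", compatible(P_x, P_z))
print("z compatible with d:", compatible(P_z, P_d))
print("x compatible with d:", compatible(P_x, P_d))
```

Here the ray along z belongs to two different maximal Boolean sub-algebras, one containing the x ray and one containing the diagonal ray, which is the sense in which the algebra is ‘pasted together’.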

The completeness problem for quantum mechanics was solved by Gleason.³
Gleason’s theorem states that in a Hilbert space of 3 or more dimensions, all
possible probability measures on the partial Boolean algebra of subspaces
(isomorphic to the partial Boolean algebra of projection operators or idempotent
magnitudes) may be generated by the statistical operators according to the quan-
tum mechanical rule. Kochen and Specker⁴ pointed out a corollary to Gleason’s
theorem: Because there are no dispersion-free statistical operators, no dispersion- 
free probability measure is definable on the logical space, except in the case of a 
2-dimensional Hilbert space. 

A dispersion-free probability measure satisfies the condition

\[ \mathrm{Exp}(A^2) = (\mathrm{Exp}(A))^2 \]

for every magnitude A, where Exp(A) is the expectation value of the magni-
tude A. Equivalently,

\[ p^2(s) = p(s) \]

for every sentence s in the logic, i.e. p(s) = 1 or 0. Thus, a dispersion-free proba-
bility measure amounts to a bivalent valuation on the logic which is required to
satisfy the usual semantic definition of the logical connectives for compatible
sentences only, i.e. such that

\[ v(\neg s) = 1 - v(s) \]
\[ v(s \wedge t) = v(s)\,v(t) \quad \text{if $s$ and $t$ are compatible,} \]

from which it follows that

\[ v(s \vee t) = v(s) + v(t) - v(s)\,v(t) \quad \text{if $s$ and $t$ are compatible.} \]

I shall call this assignment of truth values a ‘partial Boolean valuation’.


1 This term is due to Kochen and Specker, who have investigated the properties of these
algebraic systems. See Kochen and Specker [1967].

2 Kochen and Specker, ibid., use the term ‘commeasurability’. This suggests a particular
and, I think, misleading interpretation of the relation.

3 Gleason [1957].

4 Kochen and Specker, op. cit.
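
A small numerical case may help fix the notion of dispersion. The sketch below assumes that the quantum mechanical rule assigns to a projection operator P the probability Tr(WP), with W the statistical operator, and it picks one illustrative pure state in a three-dimensional real space. The helper names are hypothetical, and a single example of course illustrates the corollary without proving it.

```python
def projector(v):
    """Projection operator onto the ray spanned by the real vector v."""
    norm_squared = sum(c * c for c in v)
    return [[vi * vj / norm_squared for vj in v] for vi in v]


def trace_of_product(W, P):
    """Tr(WP), the probability assumed to be assigned to P by the statistical operator W."""
    n = len(W)
    return sum(W[i][k] * P[k][i] for i in range(n) for k in range(n))


# Illustrative statistical operator: the pure state along the x-y diagonal.
W = projector((1.0, 1.0, 0.0))

# One maximal compatible set of sentences: the three coordinate rays.
basis = {
    "x": projector((1.0, 0.0, 0.0)),
    "y": projector((0.0, 1.0, 0.0)),
    "z": projector((0.0, 0.0, 1.0)),
}

probabilities = {name: trace_of_product(W, P) for name, P in basis.items()}
total = sum(probabilities.values())

# Dispersion-free would require p(s)^2 = p(s), i.e. p(s) = 1 or 0, for every s.
is_dispersion_free = all(abs(p * p - p) < 1e-12 for p in probabilities.values())

print("probabilities:", probabilities)
print("sum over the maximal compatible set:", total)
print("dispersion-free on this set:", is_dispersion_free)
```

The assignment behaves as an ordinary probability measure on the chosen maximal compatible set, yet it is not bivalent there, so this state does not yield a partial Boolean valuation.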

The non-existence of partial Boolean valuations means that it is impossible to 
embed the logical space in a Boolean algebra. It follows that it is impossible to 
introduce a ‘phase space’, Z, and represent each physical magnitude by a real- 
valued (Borel) function on Z, in such a way that each maximal compatible set of 
magnitudes is represented by a set of phase space functions which preserve the 
functional relationships between the magnitudes. What cannot be achieved for 
magnitudes $A_1, A_2, \ldots$ which are all functions of the magnitude B, i.e. $A_1 = g_1(B)$,
$A_2 = g_2(B)$, ..., is that if B is represented by the phase space function $f_B$, then
$A_1$ is represented by the phase space function $f_{g_1(B)} = g_1(f_B)$, $A_2$ is repre-
sented by the phase space function $f_{g_2(B)} = g_2(f_B)$, etc. Thus, there is no phase
space reconstruction of the quantum statistics which preserves the functional
relationships between compatible sets of physical magnitudes.¹

The interpretation I have sketched above of the role of the Hilbert space in 
quantum mechanics may be called the logical interpretation. It was proposed 
by Hilary Putnam, following Finkelstein.² On this interpretation, the partial
Boolean algebra of sub-spaces of Hilbert space is taken as the logical space of 
micro-events. A vector in Hilbert space (or rather, a 1-dimensional subspace) 
then represents an elementary event, not a statistical state. Gleason’s theorem 
provides the solution to the problem of specifying all possible probability measures 
on such an event structure: the probability calculus on this partial Boolean algebra
is generated by the set of statistical operators. 

Thus, the significance of the transition from classical to quantum mechanics 
is understood as the proposal—on empirical grounds—that the logical space of 
events (micro-events) is non-Boolean. Just as the significance of the transition
from classical to relativistic mechanics lies in the proposal that geometry can
play the role of an explanatory principle in physics, that the geometry of events
is not a priori, and that it makes sense to ask whether the world geometry is
Euclidean or non-Euclidean, so the significance of the quantum revolution lies in
the proposal that logic can play the role of an explanatory principle, that logic is
similarly not a priori.


1 The existence of (sufficient) partial Boolean valuations on the logical space is equivalent
to the possibility of introducing a phase space. See Kochen and Specker, op. cit., or Bub
[1972]. Of course, the phase space Z need not be a phase space in the sense that it is
parametrized by generalized position and momentum coordinates. It is a phase space in
the sense that the points of this space define 2-valued probability measures, i.e. partial
Boolean valuations, or assignments of values to the magnitudes satisfying the above
functional relationships. That is to say, it is a phase space in the sense that the points of this
space specify state descriptions in an analogous sense to the points of classical mech-
anical phase space.

2 Putnam [1969]. Putnam’s proposal should not be confused with the work on ‘quantum
logic’ by mathematicians and physicists such as Birkhoff and von Neumann, Jauch and
Piron, Mackey, Varadarajan, Gudder, etc., with the primary aim of providing new axio-
matic foundations for quantum mechanics. For a discussion of this point, see Bub,
op. cit.

The logical interpretation of the Hilbert space leads to an immediate solution 
to the measurement problem. This is the problem of accounting for the peculiar 
quantum mechanical rule for conditional probabilities—the so-called ‘projection 
postulate’. It can be shown that the projection postulate is the appropriate rule 
for the non-Boolean logical space of quantum mechanics.¹

The logical interpretation is, of course, not the usual interpretation of the 
Hilbert space. Usually, the Hilbert space is taken as the space of statistical states 
of the theory, with each unit vector representing a statistical state. ‘Mixtures’ 
of such ‘pure states’ may be defined as more general statistical states, specifying 
probability assignments representable as weighted sums of probability assign- 
ments generated by pure states. I shall refer to this as the statistical interpretation 
of the Hilbert space. 

Now, there can be no motive for the statistical interpretation other than the 
prejudice for a Boolean logical space, i.e. the underlying assumption is that the 
Boolean character of logic is a priori. On this interpretation, quantum mechanics 
is an incomplete statistical theory, because the statistical states do not generate 
all possible probability measures on a Boolean logical space. It seems that a new 
Boolean mechanics is required, a phase space reconstruction of the quantum 
statistics. But any such reconstruction will involve the measurement problem: 
the measurement problem is characteristic of the statistical interpretation. 

To show this, I shall use the notation p(s.t) for the probability that a first
measurement satisfies the sentence s (i.e. s is found to be true in the measurement)
and a subsequent measurement satisfies t; I reserve the notation p(s/t) for the
conditional probability of s given t, and p(s ∧ t) for the joint probability of s and t,
i.e. the probability that the conjunction, s ∧ t, is true. Generally in quantum mech-
anics:

\[ p(s.t) \neq p(t.s) \]

unless s and t are compatible. In a Boolean or phase space theory, however,

\[ p(s) = \mu(\Phi_s) \]

where $\Phi_s$ is the set of phase space points satisfying s, and $\mu(\Phi_s)$ is the measure of
the set $\Phi_s$. And

\[ p(s/t) = \frac{\mu(\Phi_s \cap \Phi_t)}{\mu(\Phi_t)} \]

i.e. the probability of s given t is the measure of the set of phase space points
satisfying s in the set $\Phi_t$, with respect to a renormalised measure assigning pro-
bability of 1 to the set $\Phi_t$. But then

\[ p(s/t) = \frac{p(s \wedge t)}{p(t)} \]

and so

\[ p(t.s) = p(t)\,p(s/t) = p(s \wedge t) = p(s.t) \]

It follows that if we take p(t.s) as equal to p(t)p(s/t), the phase space rule for
conditional probabilities must be violated in a phase space reconstruction of the
quantum statistics.


1 See Bub [1971].
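
The order-dependence of successive measurement probabilities for incompatible sentences can be exhibited in the smallest quantum case. The sketch below is an illustration under stated assumptions: the projection postulate is taken in the form W → PWP (unnormalised) after a sentence with projector P is found true, the system is a spin-½ particle prepared spin-up along z, and s and t are the sentences ‘spin is up along x’ and ‘spin is up along z’. The function names are hypothetical.

```python
def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def trace(A):
    """Sum of the diagonal entries of a square matrix."""
    return sum(A[i][i] for i in range(len(A)))


def sequential_probability(W, first, second):
    """Probability that `first` is found true and then `second` is found true.

    Uses the assumed projection rule W -> P W P (unnormalised) after each result.
    """
    after_first = matmul(matmul(first, W), first)
    after_second = matmul(matmul(second, after_first), second)
    return trace(after_second)


W = [[1.0, 0.0], [0.0, 0.0]]     # statistical operator: spin up along z
P_t = [[1.0, 0.0], [0.0, 0.0]]   # t: spin is up along z
P_s = [[0.5, 0.5], [0.5, 0.5]]   # s: spin is up along x (incompatible with t)

p_s_then_t = sequential_probability(W, P_s, P_t)   # p(s.t) in the notation above
p_t_then_s = sequential_probability(W, P_t, P_s)   # p(t.s) in the notation above

print("p(s.t) =", p_s_then_t)
print("p(t.s) =", p_t_then_s)
```

The two orders give different probabilities, whereas the phase space rule would make both equal to the single joint probability p(s ∧ t).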

Instead of computing $p_W(s/t)$ according to the rule:

\[ p_W(s/t) = \frac{\mu_W(\Phi_s \cap \Phi_t)}{\mu_W(\Phi_t)} = \mu'_W(\Phi_s) \]

(where $p_W$ is the probability assignment generated by the statistical state W), we
must use the rule:

\[ p_W(s/t) = \mu_{W'}(\Phi_s) \]

where $\mu_{W'}$ is the phase space measure associated with the statistical state W′ and
the transition W → W′ is given by the projection postulate.

The difference between $\mu'_W$ and $\mu_{W'}$ is just this: $\mu'_W$ is the original measure
renormalised to the set $\Phi_t$, whereas $\mu_{W'}$ is a uniform probability measure over the
set $\Phi_t$. Both measures assign probability 1 to the set $\Phi_t$, i.e.

\[ \mu'_W(\Phi_t) = \mu_{W'}(\Phi_t) = 1 \]

but the relative probability of subsets in $\Phi_t$ is different. According to the straight
phase space rule, the relative probability of subsets in $\Phi_t$ is unchanged by the
additional information that t is true. But in a phase space reconstruction of the
quantum statistics, we must assume that any initial information concerning the
relative probability of subsets in $\Phi_t$ is somehow invalidated by the additional
information that t is true (or false). The fact that an initial probability measure is
reduced or ‘collapses’ to the set $\Phi_t$ is not problematic here. What is problematic
is that this reduction is accompanied by a randomisation process, i.e. the reduced
probability measure becomes uniform over $\Phi_t$.
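
The contrast between a renormalised measure and a uniform one can be seen on a toy phase space. The sketch below is purely illustrative: the four points, their weights and the two sets are invented for the example, and the uniform measure is simply written down by hand to mirror the description above rather than derived from any quantum state.

```python
from fractions import Fraction

# Hypothetical four-point phase space with an initial measure mu.
mu = {
    "x1": Fraction(1, 2),
    "x2": Fraction(1, 4),
    "x3": Fraction(1, 8),
    "x4": Fraction(1, 8),
}
phi_s = {"x1", "x4"}          # points satisfying s
phi_t = {"x1", "x2", "x3"}    # points satisfying t


def measure(weights, points):
    """Total weight assigned to a set of phase space points."""
    return sum((weights[x] for x in points if x in weights), Fraction(0))


# Straight phase space rule: renormalise mu to phi_t.
renormalised = {x: mu[x] / measure(mu, phi_t) for x in phi_t}
# Randomised alternative: a uniform measure over phi_t.
uniform = {x: Fraction(1, len(phi_t)) for x in phi_t}

p_s_given_t = measure(renormalised, phi_s)
p_t_given_s = measure(mu, phi_s & phi_t) / measure(mu, phi_s)

# Under the phase space rule both orders reduce to the joint probability.
p_t_then_s = measure(mu, phi_t) * p_s_given_t
p_s_then_t = measure(mu, phi_s) * p_t_given_s
joint = measure(mu, phi_s & phi_t)

print("renormalised:", renormalised)
print("uniform:", uniform)
print("p(t.s), p(s.t), p(s and t):", p_t_then_s, p_s_then_t, joint)
print("p(s/t) under uniform measure:", measure(uniform, phi_s))
```

Both measures give probability 1 to the set satisfying t, but only the renormalised one keeps the original ratio between the weights of x1 and x2; the uniform measure discards it, which is the randomisation the text describes.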

The major question of Quantum Theory and Beyond concerns the sense in which 
quantum mechanics is non-classical. All other questions are secondary: they 
depend on how one sees the quantum revolution. The book is uninteresting as a 
work in the philosophy of science, because the contributors have nothing new to 
say on the significance of the transition from classical to quantum mechanics. 
What is new here is the ingenuity with which old mistakes are disguised as 
fresh insights. I shall consider Bohm’s contribution in some detail, because his 
description of the motivation for the hidden variable approach, and his character- 
isation of a hidden variable theory, afford a very nice illustration of what I mean. 

For Bohm, the description of the world as ‘a union of disjoint elements’ 
(p. 33) is characteristic of classical physics. The common mistake of most 
physicists following von Neumann, he says, is to ‘continue to talk about a 
“quantum system” as if it were constituted of interacting components (e.g.
particles) which exist separately from each other, and from the instrument that
is used in “observing the quantum state of the system”.’ (p. 33). Bohr’s insight, 
according to Bohm, was the realisation that ‘what is relevant instead is the whole- 
ness of the form of the experimental conditions and the content of the experimental
results.’ (pp. 33-4). Bohm illustrates what he has in mind by a discussion of the
Heisenberg microscope experiment. He points out that something is usually 
overlooked in an analysis of this experiment: 


‘We note that from a particular set of experimental conditions, as deter-
mined by the structure of the microscope, etc., one could in some rough 
sense, say that the limits of relevance of the classical description of the 
“observed object” are indicated by a certain cell in the phase space of the 
object... If, however, we had different experimental conditions (e.g., a 
microscope of another aperture, electrons of different energy, etc.), then 
these limits would be indicated by another cell in this phase space.... Both 
cells would have the same area, h, but their “shapes” would be different.
Now in the corresponding discussion of the classical situation, it is possible 
to say that the experimental results do nothing more than permit inferences 
about an observed object which exists separately and independently, in the 
sense that it can consistently be said to “have” these properties whether it 
interacts with anything else (such as an observing apparatus) or not.... 
However, in the “quantum” context the situation is very different. Here, 
certain relevant features of what is called the observed particle, i.e., the 
“shapes” of the cells in phase space, cannot properly be described except in
conjunction with a description of the experimental conditions. Nor can one
say that the “shapes” correspond only to our lack of knowledge about the 
precise position and momentum of the observed object, considered as 
separate and disjoint from the overall experimental arrangement. . .. 
Therefore, the description of the experimental conditions does not drop out 
as a mere intermediary link of inference, but remains amalgamated with 
the description (both formal and informal) of what is called the observed 
object. This means that the “quantum” context calls for a new kind of 
description, which does not make use of the potential or actual separability 
of “observed object” and “observing apparatus”. Instead, as has been in- 
dicated earlier, the form of the experimental conditions and the content of 
the experimental results have now to be one whole, in which analysis into 
disjoint elements is not relevant’ (pp. 36-7). 


Evidently, Bohm sees this as the ‘revolutionary’ component in Bohr’s com- 
plementarity interpretation. But: 


‘Bohr went on to say that the terms of the discussion of the experimental con-
ditions and of the experimental results were necessarily those of “ordinary
language” suitably “refined” where necessary, so as to take the form of classical
dynamics. ... So, in a certain sense, Bohr takes quantum theory to be a
certain kind of “generalisation” of classical theory. What is to be observed is
always described in classical language (regarded as a refinement of ordinary
“everyday” language), but the generalisation consists in replacing the class-
ical algorithm (the differential equation applying to individual systems) by a
quantum algorithm (matrix theory applying only to statistical ensembles).
Since the classical language was supposed to be the only possible means of
unambiguous communication, and since the terms of this language could not
consistently be defined together, if one used the quantum algorithm for
making statistical inferences in the usual way, he also concluded that it is
impossible to find any unambiguous language at all that could treat the order
of occurrence of these statistical fluctuations as relevant. Therefore it would 
necessarily be a source of confusion merely to entertain the notion of “hid- 
den variables” in terms of which these contingent fluctuations would be 
revealed unambiguously in a new field of novel orders of necessity’ (p. 39). 


For Bohm this is the ‘conservative’ component in Bohr’s interpretation. Bohm 
proposes a hidden variable theory in which ‘one can discuss a possible new kind 
of significance for the order of successive operations (i.e. “measurements”).
According to current quantum theory, this order has to be “random”. Moreover, 
as one can see, the particular order determined by the contingent parameters 
(i.e. “hidden variables”) also could not be incorporated in any classical theory. 
Thus, if this order is significant, one will have to describe the experimental 
results themselves (and more generally, the experimental conditions, as well) in 
terms of a new language form that is neither “classical” nor “quantum” ’ (p. 40).

Now, it seems that Bohm is proposing a very novel and radical thesis concerning 
the interpretation of quantum mechanics (in terms of the notion of “wholeness”),
as well as a very specific suggestion for going beyond the quantum theory, in a 
way that will allow the description of a certain domain of phenomena excluded 
from consideration by the current formulation of the theory. I have quoted 
Bohm at length because I want to show that this interpretation of quantum 
mechanics is no more than the statistical interpretation of the Hilbert space, that 
the proposed hidden variable theory (‘which implies a non-dynamical approach to 
the conceptual structure of physics, that is in many ways very different from that 
of classical physics’ (p. 102)), is simply a phase space theory in a quite straight- 
forward sense, and that any appearance to the contrary reflects only the fact that 
a certain way of talking has disguised the measurement problem, which is a 
necessary feature of this approach. 

On the statistical interpretation of the Hilbert space, quantum mechanics is 
incomplete: the statistical states of the theory do not generate all possible 
probability measures on the Boolean logical space of micro-events. It follows 
that the idempotent magnitudes of the theory cannot directly represent the pro- 
perties of micro-systems, and that the statistical states of the theory refer to a cer- 
tain set of statistical ensembles, each of which satisfies the uncertainty principle 
for the magnitudes of the theory, but not for the fundamental magnitudes which 
are determined by states defined as Boolean ultrafilters, or points in phase space. 
On this view, equivalence in the partial Boolean algebra of idempotent magni- 
tudes represents statistical equivalence, but not necessarily logical equivalence. 
That there is no Boolean valuation (or even partial Boolean valuation) on the 
partial Boolean algebra of idempotents is simply taken as a confirmation of the 
incompleteness of the theory. The complete theory will in general represent such 
statistically equivalent magnitudes as different phase space functions, while 
reproducing (in terms of appropriate phase space measures) the statistical equiv- 
alence for every Borel set S and every phase space measure corresponding to a 
statistical state in quantum mechanics. 

The various hidden variable theories all amount to different schemes for com-
pleting quantum mechanics in this sense. They are ‘classical’ precisely because
they involve the rejection of the logical interpretation of Hilbert space, and the 
stipulation that the structure of the logical space of micro-events is Boolean. 
In his original hidden variable theory,1 Bohm varied the distribution of hidden
variables according to the kind of measurement involved. Now, tailoring the 
phase space probability measure to the relevant maximal Boolean sub-algebra in 
the logical space2 is formally equivalent to introducing a fixed measure for each
quantum mechanical state, and representing a single quantum magnitude (which 
is a function of various incompatible maximal3 magnitudes) by different phase
space functions, i.e. it is formally equivalent to replacing each theoretical sentence
corresponding to a (non-maximal) idempotent quantum magnitude by a family of 
sentences (one for each maximal Boolean sub-algebra) in the Boolean logic. The 
distinction between Bohm’s original theory and the more recent Bohm-Bub 
theory described in Quantum Theory and Beyond (pp. 95-166) is just this: The 
Bohm-Bub theory introduces a fixed phase space measure for each quantum 
statistical state, and a different map associating phase space points with the truth 
values of sentences for each maximal Boolean sub-algebra. Evidently, relativising 
the bivalent valuations to a maximal Boolean sub-algebra is formally equivalent to 
introducing a different sentence for each maximal Boolean sub-algebra. 

The maps are generated by the ‘non-dynamical’ equation of motion, which 
involves the phase point and a set of operators which specify the maximal 
Boolean sub-algebra. Given the phase point and a maximal Boolean sub-algebra, 
the equation describes the transition to a new phase point which assigns particular 
values to the relevant set of compatible magnitudes, i.e. particular truth values to
the sentences val(A) ∈ S associated with the maximal Boolean sub-algebra. So,
the non-dynamical equation of motion plays the role of an algorithm for assigning 
truth values to the sentences associated with any maximal Boolean sub-algebra, 
for any point in phase space. It is a simple matter to introduce a probability mea- 
sure on this phase space which generates the quantum statistics for each quantum 
statistical state. 

Bohm points out that ‘in a theory of spin operations, corresponding to what 
has been called “measurements of incompatible observables”, the “hidden 
variables” can be progressively limited and defined. In this process they cease to 
be “hidden” since they can now reveal themselves in a new kind of statistical 
distribution of results, not obtainable in terms of the current quantum theory’ 
(p. 112). I would paraphrase this as follows: The straight phase space rule for the 
calculation of conditional probabilities (e.g. the conditional probability of a 
certain spin value, given that an incompatible spin magnitude has a certain value) 
contradicts the quantum mechanical rule. Bohm continues: ‘If the contingent 
parts of the description (i.e. the “hidden variables”) can reveal themselves in the 
way described above through non-normal distributions, one may ask why they 
have not yet done so in experiments that have thus far been carried out. A very 
simple answer is that the contingent variables may be subject to some kind of 
stochastic process’ (p. 112). Paraphrase: The quantum mechanical rule for con-
ditional probabilities can only be recovered in a phase space theory by supposing
that the probability measure is somehow randomised in the transition from an
initial probability measure to a conditional probability measure.
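
The contrast drawn in these two paraphrases can be written out as a short sketch. It assumes that the ‘quantum mechanical rule’ is the usual projection (Lüders) rule; the symbols \mu, \rho, P_A and P_B are notation introduced here for illustration and do not appear in the book under review.

```latex
% Phase space rule: conditionalise the measure \mu on the set B
\mu(A \mid B) = \frac{\mu(A \cap B)}{\mu(B)}, \qquad \mu(B) > 0 .

% Quantum rule (assumed here: the projection, or Lueders, rule) for a state \rho
p_\rho(A \mid B) = \frac{\operatorname{Tr}(P_B \, \rho \, P_B \, P_A)}{\operatorname{Tr}(\rho \, P_B)},
\qquad \operatorname{Tr}(\rho \, P_B) > 0 .
```

When P_A and P_B commute, the second expression reduces to Tr(\rho P_A P_B)/Tr(\rho P_B), which has the same form as the first. For incompatible magnitudes it does not in general, and that is the case at issue in the paraphrase.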

Evidently, the Bohm-Bub theory involves nothing more than a particular set


1 Bohm [1952].
2 For each such sub-algebra corresponds to a ‘complete commuting set of observables’.
3 I.e. ‘non-degenerate’. 
of rules for generating the quantum statistics on a Boolean logical space. The
measurement problem appears in the standard way, and there is no indication of a 
solution. Bohm has entirely missed the import of the completeness problem. He 
sees von Neumann’s result as excluding only a limited class of hidden variable 
theories, those satisfying a certain ‘linearity postulate’. Now in the first place, this 
assumption is avoided in Gleason’s theorem, or the work of Kochen and Specker. 
But more importantly, Bohm has failed to understand the relevance of von 
Neumann’s inquiry. He wonders: ‘Indeed, if the terms of the theory are assumed 
to provide a complete description of everything, what can its potential falsifi- 
ability possibly mean? Logically speaking that which lead to the falsification 
would have to be undescribable’ (p. 101). The notion of completeness Bohm is 
referring to here has absolutely nothing to do with the completeness problem of 
quantum mechanics, and could not possibly have interested a mathematician
like von Neumann. 

Once the implications of the statistical interpretation of the Hilbert space are 
understood, constructions like the Bohm-Bub theory are seen to be pointless. 
It is quite obvious that such constructions are possible, and it is also obvious what 
their basic features are. Bohm’s insight that ‘certain relevant features of what is 
called the observed particle, i.e. the “shapes” of the cells in phase space, cannot
properly be described except in conjunction with a description of the experi- 
mental conditions’ (p. 37) is simply the recognition that in a Boolean or phase 
space reconstruction of the quantum statistics, a different probability measure 
will have to be introduced for each maximal Boolean sub-algebra. But this in no
way justifies the fantastic conclusion that ‘it has no meaning to say, for example, 
that there is an “observed object” that interacts with the “observing instrument” ’ 
(p. 38). 

Bohm’s work on the foundations of quantum mechanics may be divided into 
three categories. Firstly, there is the quasi-philosophical analysis of ‘language 
forms’, ‘wholeness’, etc. (recently with Donald Schumacher).1 This inquiry I 
regard as confused and uncritical. Secondly, there is the work on hidden variable 
theories.2 Now, Bohm’s original hidden variable theory was important because it
drew attention to the completeness problem. By explicitly producing a phase 
space reconstruction of the quantum statistics, Bohm showed the untenability 
of interpretations of quantum mechanics which combine the statistical 
interpretation of the Hilbert space with the claim of completeness. Unfortun- 
ately, the critical function of the construction was not understood, and the ‘hidden 
variable theory’ was summarily rejected as a quite inadequate rival to quantum 
mechanics. But now the work of Gleason, and of Kochen and Specker, on the
completeness problem makes the construction of more ‘hidden variable theories’ 
of this kind redundant. And redundancy breeds confusion. Thus, the Bohm-Bub 
theory is proposed as a solution to the measurement problem, when it is no more 
than a statement of the problem. 

Thirdly, there is the work on discrete space-time structures in terms of the 
concepts of algebraic topology.3 Now this research is of fundamental importance,
but neutral with respect to the interpretation problem. The structure of space- 


1 See the list of references after Bohm [197%].
2 For references see Bohm [1958].
3 As far as I know, this work is available in the form of preprints only. 




time is a problem in its own right, and Bohm’s penetrating theories in this field 
are quite compatible with the logical interpretation of Hilbert space—they in no 
way presuppose the statistical interpretation. 

Actually, Bohm presented a very interesting paper on spinors1 at the colloquium
and did not discuss the Bohm-Bub theory. This accounts for Bastin’s otherwise 
puzzling remark in the introduction to the section containing Bohm’s article: 
‘He discusses a new approach to the spinor calculus which might be a first step 
in such a development. Bohm’s paper is followed by a section of the discussion 
to which it lead’ (p. 182). The discussion is on the spinor paper, not on the Bohm-
Bub theory.

There are several papers in Quantum Theory and Beyond dealing with combin-
atorial or algebraic-topological studies of discrete space-time structures. The
articles by Penrose and Atkin are fascinating and very readable. Hiley’s article is
an attempt to graft this research (including, of course, Bohm’s work in this field) 
on to the Bohmian wholeness jargon. It is a maze of non-sequiturs. The quantum 
mechanical measurement problem is related to the ‘wholeness of the description’ 
(p. 182), which is contrasted with the ‘local nature of the classical description’ 
(p. 185). Wholeness is the denial of ‘disjunction’: ‘this disjunction implies the 
possibility of analysis which, as we have seen, is not relevant in this context’
(p. 182). The ‘impossibility of analysis’ is taken to mean that ‘we can no longer
use any description containing the continuum notion, either explicitly or implic- 
itly, since this notion always leads to the possibility of subdivision into parts’ 
(p. 264). Now, homology theory provides a formalism appropriate for the des- 
cription of discrete processes, hence homology theory is fundamentally non- 
classical, the ‘formal form’ of a language of which the ‘informal form’ is ‘wholeness’ 
(p. 265). 

This entire house of cards collapses once it becomes evident that Bohm’s 
notion of wholeness amounts to a formulation of the measurement problem in a 
Boolean or phase space reconstruction of the quantum statistics, and that quantum 
mechanics is non-classical because it is non-Boolean, not because it is non-local 
or discrete. I see no obvious connection between discreteness and non-locality 
or wholeness in Bohm’s sense. 

The section of philosophical papers is uniformly poor, with the possible excep-
tion of von Weizsäcker’s article. Bunge proposes to ‘handle in a philosophical
way the question: what is QT about?’ (p. 265). He reformulates the question as:
‘What is the reference class of QT?’ (p. 271). The problem is immediately solved: 
The reference class of the state vector for a free particle of a certain kind is 
‘the set of all (actual and possible) particles or, rather, pseudo-particles of the 
given kind’ (p. 271). For a particle in an external field, the reference class is the 
set of all ordered particle-field pairs. And so on. Bunge concludes that a ‘semantic 
analysis of QT’ shows that ‘the reference class of any quantum theory is a set of 
physical systems, or of pairs (or triples or in general n-tuples) of physical 
systems’ (p. 271), and ‘that it is unnecessary and moreover misleading to 
ascribe such systems any mental properties, even worse to fuse them with the 
subject’ (p. 271). Semantic analysis indeed! A similar analysis would show that 
horticulture is about flowers, or of pairs (or triples or in general n-tuples of
flowers).


1 ‘Space-Time Geometry as an Abstraction from “Spinor” Ordering’.
There is a rather silly paper by Post on the incompleteness of quantum mech-
anics. He argues that ‘the mere fact that the theory does not go beyond proba-
bility statements renders it incomplete in my sense’ (p. 280), because any proba-
bilistic theory is ‘incomplete’ as Post uses the term. One might as well say that
quantum mechanics is complete merely because the theory involves a differential 
equation of motion for the state, and any theory involving a differential 
equation of motion for the state is complete. Post remarks that the existence of 
incompatible magnitudes in the theory ‘makes a realistic interpretation of the 
quantum mechanical description of a system impossible’ (p. 277). I have pointed 
out that a realist philosophy of physics involves either rejecting the a priori 
character of Boolean logic and retaining quantum mechanics, or rejecting 
quantum mechanics (as incomplete) and retaining Boolean logic. To put it simply: 
If quantum mechanics is true, logic cannot be a priori. 

Von Weizsäcker’s article apparently introduces the logical interpretation of
Hilbert space as I have sketched it above: 


‘J. v. Neumann first proposed to compare projection operators in Hilbert 
space with propositions, and their eigenvalues 1 and 0 with the truth values
“true” and “false”. ... This logic is non-classical. The corresponding 
lattice of propositions is not Boolean rather it is isomorphic with the lattice 
of the subspaces of the Hilbert-space that means with a projective geometry. 

... Probability is a fundamental concept in quantum theory. Asking what
laws quantum mechanical probabilities obey one will first ask whether they
conform to Kolmogoroff’s axioms of probability. These axioms introduce
probability as a real function defined on a Boolean lattice of what is called
possible events. That events should form a Boolean logic follows from class-
ical logic, since two events can be connected by “and” and “or”. In quantum
theory what is changed does not seem to be the set of ensuing axioms but
the lattice to which they are applied; this seems to indicate a change in the
underlying logic’ (pp. 66-7).
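
The claim in this passage that the lattice of subspaces is not Boolean can be checked on a small case. The following is a minimal sketch, assuming a real two-dimensional space in place of a Hilbert space. The helper names (span_projector, join, meet, ray) are hypothetical and introduced only for illustration. Propositions are represented by projectors, ‘or’ by the span of two subspaces and ‘and’ by their intersection.

```python
import numpy as np

def span_projector(columns, tol=1e-10):
    """Orthogonal projector onto the span of the columns of a real matrix."""
    u, s, _ = np.linalg.svd(np.asarray(columns, dtype=float))
    basis = u[:, : int(np.sum(s > tol))]  # orthonormal basis of the column space
    return basis @ basis.T

def join(p, q):
    """'or': projector onto the smallest subspace containing both ranges."""
    return span_projector(np.hstack([p, q]))

def meet(p, q):
    """'and': projector onto the intersection, as not(not-p or not-q)."""
    identity = np.eye(p.shape[0])
    return identity - join(identity - p, identity - q)

def ray(v):
    """Projector onto the one-dimensional subspace spanned by v."""
    v = np.asarray(v, dtype=float)
    v = v / np.linalg.norm(v)
    return np.outer(v, v)

# Three distinct rays in the plane (an illustrative choice).
a, b, c = ray([1, 0]), ray([1, 1]), ray([1, -1])

lhs = meet(a, join(b, c))           # a and (b or c)
rhs = join(meet(a, b), meet(a, c))  # (a and b) or (a and c)
distributive = np.allclose(lhs, rhs)
print("distributive law holds:", distributive)
```

In this example b and c together span the whole plane, so the left-hand side is a itself, while a shares only the zero vector with b and with c, so the right-hand side is the zero projector. The distributive law, which holds in every Boolean lattice, therefore fails here.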


Von Weizsäcker understands a change in the underlying logic as a re-inter- 
pretation of the connectives ‘and’ and ‘or’. This change is not purely conven- 
tional; rather, the underlying logic ‘is not a result of particular experience but 
a presupposition of all experience’ (p. 237). For von Weizsäcker, it makes no
sense to talk of the logic of events; the appropriate logic for quantum mechanics 
is ‘the logic of quantum mechanical probability’, to use Suppes’ phrase. And 
this logic, he proposes, is a tense logic—its structure is non-Boolean because it 
represents the structure of time, ‘the most general presupposition of experience’
(p. 236).

This is a Kantian position. For a Kantian, it is only a very slight ellipsis to talk
about events being ‘connected by “and” and “or”’, for events (in the sense of
possible experiences) have a subjective component, or rather a linguistic com- 
ponent, relating to our conceptual or linguistic framework for ordering experi- 
ence. The argument is that the logic of quantum mechanical probability ought 
to be non-Boolean, because this is the appropriate logic for contingent statements. 
Thus: ‘Quantum theory has only made us aware of logical distinctions we were 
permitted to neglect in classical science.’ (p. 237).2


1 Suppes [1966], p. ao. 2 My italics.
I shall not discuss von Weizsäcker’s argument here, because it is only sketched
in the article. It seems to me that the structure of time, or more generally of 
space-time, is an entirely separate issue from the structure of quantum logic. I 
think it is important to see that certain questions are excluded from consider-
ation if one adopts von Weizsäcker’s view. The completeness problem, for
example, can make no sense to a Kantian. 

Von Weizsäcker proposes ‘not to use the values “true” and “false” for state- 
ments about the future. This is, I suspect, what Aristotle had in mind when he 
wrote his chapter 9 of De Interpretatione. A statement about the future can be 
proved or disproved by inspection only when it is no longer a statement about the 
future. Statements about the future can, however, meaningfully be called 
“necessary”, “possible”, “impossible”, “probable with probability 1”, and so on;
“modalities”, as logicians say, can be applied to them’ (pp. 240-241). Now this is
an evasion of the problem, in the same way that Bohm’s notion of wholeness is 
an evasion of the problem. Here the structure of time is linked, via Aristotle’s sea-
battle argument, to a bivalent semantics with truth value gaps. It is unnecessary
even to object that this argument is confused. For the mere introduction of 
truth value gaps, whether or not this is associated with contingency, will not 
alter the Boolean character of the logical space. It is possible to introduce a 
bivalent semantics for quantum logic, and it is quite obvious that this semantics 
may be replaced by a semantics with truth value gaps. This simply amounts to 
leaving the logical connectives undefined for incompatible sentences. The change 
is superficial. On the (realist) logical interpretation of Hilbert space, quantum 
mechanics is indeterministic because there are no partial Boolean valuations on 
the logical space of events, and not because statements about the future are 
neither true nor false in Aristotle’s sense.

I see the history of the development of interpretations of the quantum theory 
along the following lines: The period 1925-35 culminated in an apparent choice 
between the realist interpretation of de Broglie, Schrödinger, and Einstein, and
the Kantian Copenhagen interpretation of Bohr and Heisenberg—with the 
Copenhagen interpretation decidedly the front-runner. Indeed, Einstein’s 
realism seemed naive and reactionary, compared to the Bohr-Heisenberg ‘tran-
quillizing philosophy’ (Einstein’s phrase1) with its complementarity principle,
completeness thesis, etc. Nothing was added to this debate until 1952, when 
Bohm published a hidden variable theory which for the first time clearly and in 
detail formulated the implications of Einstein’s realist view, specifically the 
incompleteness of quantum mechanics on the statistical interpretation of the 
Hilbert space. There was no further significant contribution to the problem of 
interpretation until Putnam proposed a realist interpretation of quantum mech- 
anics as a complete theory, on the basis of a logical interpretation of the Hilbert 
space. 

Now, Quantum Theory and Beyond does not really go beyond 1935 (if we 
understand Bohm’s hidden variable theory as the articulation of what was 
implicit in Einstein’s position), as far as the interpretation problem is concerned.
I have discussed Bohm’s contribution at some length, because this seemed to 
me the best way to bring out the issues involved, and to support my general 
criticism.

1 Einstein [1928]. 
Bastin concludes his editorial with the following comment:


‘Suppose for a minute that the complete picture of Bohr—that the classical
language is inviolate and that discreteness must enter physics through the
complementarity principle—has gone. Does this mean that this, the most 
sophisticated current theory which includes an “epistemological” element in
physics itself, was a mistake, and that the great physicists of the ’twenties 
were wrong in their intuition that quantum physics inescapably gave the ob- 
server a place within the theory? We shall not be able to answer this question 
until we know where the new theories that are to have a real place for the 
discrete locate that place’ (p. 11). 


I submit that the answer to this question is quite clear: No, the quantum 
theory was not a mistake; yes, the great physicists of the twenties (excluding 
Einstein, Schrödinger, and de Broglie) were wrong in their intuition. Further-
more, this question is of no more than historical interest. As the product of an
inquiry into the foundations of quantum mechanics, Quantum Theory and Beyond 
represents the uncritical acceptance of the relevance of problems and solutions 
characteristic of a philosophy of science popular among a certain group of 
European scientists in the twenties. With the exception of those articles which 
are of interest in themselves as expositions of mathematical theories of discrete 
space-time structures, the book will not, I think, be of much value either to
physicists or to philosophers of science with interests in this field. 


JEFFREY BUB
Institute for the History and Philosophy of Science,
University of Tel-Aviv 
L. L. Whyte 1896-1972





Lancelot Law Whyte, who died on 14 September 1972, was the fourth Chairman
of the Philosophy of Science Group of the British Society for the History of 
Science, as our Society was then called, and held office from 1953 to 1955. 
He was a Founder Member and an active supporter of the Society’s meetings 
and of the Journal, to which he made a number of stimulating contributions. 

Lance, as he was known to his friends, was born a son of a Scottish Presby-
terian minister in 1896. On his mother’s side he claimed descent from King
Robert the Bruce. He was educated at Bedales School, where he spent eight 
happy years, became Head Boy and won a scholarship to Trinity College, 
Cambridge. He should have gone up to Cambridge in October 1915, but instead 
he decided to join the army. He took part in the Battle of the Somme as an 
artillery officer and was awarded the Military Cross. He was wounded but 
returned to the front again. In 1919 he took up his scholarship at Trinity and 
read physics. In due course, he began research under Rutherford, but experi- 
mental physics did not appeal to him and he left Cambridge to travel abroad. 
He had come under the influence of C. K. Ogden who, in Whyte’s own words, 
‘brought European continental thinking to insular Britain’. 

After his return from the Continent, Whyte joined an investment bank in the 
city. His passion for new ideas led him to recognise the genius of Frank Whittle, 
inventor of the jet engine, whom he met in 1935. The following year he formed 
Power Jets Ltd. to back Whittle’s work and he succeeded in steering Whittle 
through many difficulties to the completely successful first flight in 1941. 

Meanwhile, Whyte had begun writing books. His first, published in 1927, 
was a short work called Archimedes, or the Future of Physics. It was soon followed 
by the more substantial Critique of Physics. In these two books he developed the 
speculative idea that there is a fundamental parameter of time which is a uni- 
versal physical constant like c and h. After the war he devoted himself entirely 
to writing and lecturing and went on many lecture tours in the United States. 
His interests were not confined to physics, but included general problems of 
biology and psychology. He was particularly fascinated by aspects of form and 
he edited a work with that title. In The Unitary Principle in Physics and Biology 
he argued that it is a fundamental principle of nature that asymmetry decreases 
and gives place to symmetry. 

In the later 1950s Whyte became very interested in the ideas of Boscovich,
particularly his concept of point-like atoms. This led him to edit R. J. Boscovich,
1711-1787, Studies of His Life and Work and later to write The Atomic Problem. 
His other books include The Next Development in Man, The Unconscious Before 
Freud and his autobiography Focus and Diversions. In his last years Whyte 
turned his attention to evolutionary biology and the development of a hier- 
archical order in nature. He believed that in living creatures an internal organis- 
ing principle operated and that natural selection and Mendelian genetics were not 
the sole factors at work. His book Internal Factors in Evolution appeared in 1965.
Although it is too soon to say which of Lancelot Whyte’s many ideas will
bear fruit, the catholicity of his interests and the unorthodoxy of his views, 
coupled with his infectious enthusiasm, acted as an intellectual stimulus. He 
was a fascinating talker and had an unrivalled gift for bringing together people 
with similar interests who, but for him, would probably never have met. Many 
of us owe him a great debt for encouraging us to look at old problems in a new 
way. He will be greatly missed. 

G. J. WHITROW


The British Library of Political and Economic Science


AN APPEAL FOR FUNDS 


What is the British Library of Political 
and Economic Science? 


First and foremost it is a great 
international centre of intellectual 
activity in social studies.

It is used not only by the academic 
staff of the London School of 
Economics and by its internationally 
famous Graduate School, but by 
senior scholars from all over Britain
as well as other parts of the world. It
is widely regarded as the world’s most
outstanding library specially devoted
to the whole range of social studies.

A substantial part of its material is not 
in the British Museum Library, which 
it therefore complements. Its collec- 
tions are already made extensively 
available, by loans and by the 
provision of photo-copies, to university 
and other libraries throughout Great 
Britain and overseas. 

It contains well over two million items 
covering economics, commerce, 
probability theory, statistics, account- 
ing, business management, industrial 
relations, social anthropology, 
sociology and social psychology, public 
and social administration, international 
United Kingdom and other countries,
economic history, modern political
history, political thought, international
relations, geography, philosophy,
logic and scientific method, history
and philosophy of the natural and of 
the social sciences and language studies. 
It has valuable collections of manu- 
scripts, private papers and records of 
government administration. 

Apart from the richness of its 
collection, much of its effectiveness 
derives directly from its central 
position, near to the British Museum 
and its Library and to the cultural, 
business and administrative centres of 
the nation. It is thus readily accessible 
to all who wish to make use of its
resources, whether they live in this
country or come from abroad.


What are its present needs? 


Founded in 1896, in recent years the 
Library has suffered increasingly from 
worsening conditions of access and 
storage; and this despite the extensive 
additional accommodation provided 
for it by the London School of 
Economics, which is now the sole 
trustee, at the expense of other 
activities and amenities. 

As a consequence, the effectiveness of 
the Library has tended to diminish 
and development has been increasingly 
stultified. If this state of affairs 
continues, a time will soon be reached 
when its position as a great national 
and international library will be 
seriously impaired. 

Until now, limitation of space and 
scarcity of sites have made it difficult 
to see ways of arresting such a develop- 
ment, which would certainly be a 
disaster for future work in the social 
sciences at home and abroad, and 
could indeed be regarded as a cultural 
What can be done?


Fortunately there has recently 
appeared a possible solution. 
Immediately adjacent to the School is a 
massive building containing some 
158,000 square feet of usable floor 
space and built to bear the weight of 
books and newspapers. It can thus be 
readily converted for library use with 
extensive accommodation for advanced 
study and the most modern equipment 
for the storage and processing of 
information.

The London School of Economics 
with the authorisation of the University 
of London and of the University 
Grants Committee has been able to
conclude a contract to purchase this
property, at a price approved by the
District Valuer, when it is vacated by
its present occupants. This is expected
to occur in or shortly after December
1973. It is to meet this undertaking,
and the cost of converting the building
into a new home for the British
Library of Political and Economic
Science, that the appeal for funds is
being made.


How much money is needed?


The cost of the building and site is 
£3,780,000. Towards this the Univer- 
sity of London, with the approval of 
the University Grants Committee, has 
undertaken to meet the cost of the 
building which is £1,980,000, provided 
that the School can pay for the site, 
valued at £1,800,000. Beyond this it is 
estimated that at least another 
£700,000 will be needed to meet the 
minimum expenses of conversion. For 
the full range of services which it is 
hoped eventually to introduce con- 
siderably more than this sum will be 
necessary. 

This means that at least £2½ million
must be raised if this project is to be 
realised. Of this, £1,800,000 must be 
in prospect by October 1973. 

A library of this sort is not just a 
collection of books: it is a centre of 
active training and research; as
essential for social studies as is a 
laboratory for natural science. It is in 
the belief that the extensions and 
improvements contemplated are 
essential to maintain the Library and 
the London School of Economics as a 
leading centre in the western world 
for the study and advancement of the 
social sciences, that we have launched 
this project. 

The success of this appeal therefore 
has a special interest for learned 
societies working in the fields with 
which it deals and for all who are 
concerned with the future of the Social 
Sciences, not only in Great Britain but 
also in all those parts of the world 
whose scholars and students have used 
the resources of the London School 

of Economics and the Library as 
alumni or visitors. 


The appeal organisation 


Lord Robbins, Chairman of the Court 
of Governors of the London School of 
Economics, is Chairman of the organi- 
sation which has been set up by the 
Court to manage and effect the Appeal 
which has charitable status. 

The Appeal Office is situated in the 
London School of Economics, 
Houghton Street, London WC2A 2AE, 
telephone 01-405 7686. 


London School of Economics & Political Science, 


Houghton Street, London WC2A 2AE 


I enclose herewith a donation of. 


1.1 Methodological Preliminaries: Scientific Research Programmes; 
Three Notions of Adhocness; the Central Notion of ‘Novelty of a 
Fact’. 

1.2 Popper, Grünbaum and Holton on Lorentz and Einstein. 

1.3 The Double Heuristic Role of Mathematics in Science: 

(a) Increase of Empirical Content through Translation into 
Mathematical Language and through the Physical Interpretation of 
Mathematical Entities. 

(b) An Important Illustration: the First Version of Lorentz’s 
Theory of Corresponding States Arises out of the Realistic 
Interpretation of the Lorentz Transformation whose Origins 
were Purely Mathematical. 

1.4 Lorentz Derived the Lorentz-Fitzgerald Contraction Hypothesis 
from the Molecular Forces Hypothesis which is in all Senses non ad 
hoc: the Michelson-Morley Experiment Lends Dramatic Support 
to the Molecular Forces Hypothesis which Conforms to the Heuristic 
of the Ether Programme. 

1.5 The Progress of Lorentz’s Programme after 1892: 

(a) The Final Version of the Theory of Corresponding States, 
(1904). 

(b) Lorentz’s Failure to Establish the Full Covariance of Maxwell’s 
Equations. 


* This paper is an expanded version of a talk given before the British Society for the 
Philosophy of Science on 7 December 1970. I gratefully acknowledge the valuable 
criticisms and suggestions received from Clive Kilmister, Imre Lakatos, John Stachel, 
John Watkins and John Worrall—all of whom were burdened with previous versions of 
this paper. 

(c) Poincaré’s Contribution and the ‘Observational Equivalence’ of 
Special Relativity and the Theory of Corresponding States. 
1.6 The Rationality of Lorentz’s Pursuing his own Programme after 
1905. 
2 Einstein’s Progress. 
3 Einstein’s Programme Supersedes Lorentz’s. 


Introduction. 

(a) The Michelson-Morley Experiment. 

Most of the answers to the question of why Einstein’s programme super- 
seded Lorentz’s refer to the weaknesses of Lorentz’s solution to the problem 
posed by Michelson’s results. I shall argue (in chapter 1) that these 
alleged weaknesses are illusory, but let me start by giving a schematic 
description in classical terms of the experiment which was first performed 
by Michelson in 1881 and then repeated with increased precision in 1887 
and after. 


FIG. 1. 


Michelson used an interferometer consisting of two perpendicular arms: 
$BE = L$ and $BD = l$. At B a half-silvered mirror makes an angle of 45° 
with BE. A light source A emits a beam which is divided at B into two 
rays: a reflected ray $R_1$, which travels along BD, is reflected at D, then 
goes back to B; and a ray $R_2$, which is transmitted along BE, falls 
perpendicularly on a mirror at E, then returns to B where it is partially 
reflected before interfering with $R_1$. Suppose BE lies in the direction of 
the earth’s motion through the ether and consider a frame of reference 
fixed with respect to the earth; then on the classical account $(c-v)$ and 
$(c+v)$ are the speeds of $R_2$ between B and E and between E and B 
respectively (where $v$ = velocity of the earth, and $c$ = speed of light). 
Hence the time taken by $R_2$ to return to B is 

$$t_2 = \frac{L}{c-v} + \frac{L}{c+v} = \frac{2L}{c}\,\beta^2, \qquad \text{where } \beta = \left(1-\frac{v^2}{c^2}\right)^{-1/2}.$$

(Because of its central role in Relativity Theory, the coefficient 
$(1-v^2/c^2)^{-1/2}$ is denoted by a special symbol.) 


FIG. 2. 

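Let $u$ be the speed of $R_1$ along BD. The velocity of $R_1$ in the ether is 
$\mathbf{v}+\mathbf{u}$, from which it follows that $|\mathbf{v}+\mathbf{u}| = c$, i.e. $v^2+u^2 = c^2$; therefore 
$u = \sqrt{c^2-v^2}$. Hence the time taken by $R_1$ to return to B is 

$$t_1 = \frac{2l}{\sqrt{c^2-v^2}} = \frac{2l}{c}\,\beta.$$

If the arms of the interferometer are equal, i.e. if $L = l$, then 

$$t_2 - t_1 = \frac{2L}{c}\,\beta^2 - \frac{2L}{c}\,\beta = \frac{2L}{c}\,\beta(\beta-1).$$

If the apparatus is rotated through 90°, the time taken by $R_1$ to come 
back to B is increased by $(2L/c)\beta(\beta-1)$, while the time taken by $R_2$ is 
diminished by the same quantity. The total time difference is therefore: 

$$\frac{4L}{c}\,\beta(\beta-1) = \frac{4L}{c}\left[\left(1-\frac{v^2}{c^2}\right)^{-1} - \left(1-\frac{v^2}{c^2}\right)^{-1/2}\right] \approx \frac{2Lv^2}{c^3}.$$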

This time difference should cause a certain shift of the inter- 
ference fringes. No such shift was observed and so Michelson claimed to 
have refuted Fresnel’s original hypothesis of an ether at rest and confirmed 
Stokes’s theory of total ether drag. However, in 1881 Michelson had 
taken the speed of $R_1$ to be $c$ in both directions, thus obtaining a time 
difference of $4Lv^2/c^3$, which is twice as large as the correct one. Lorentz 
pointed out this mistake in Michelson’s calculations and showed that 
the real shift fell within the limits of observational error.1 
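
The classical figures quoted above can be checked numerically. What follows is only an illustrative sketch in Python and forms no part of the historical argument: the arm length of 11 m and the earth's speed of 30 km/s are assumed round values, and all function names are invented for the illustration. It computes the two classical round-trip times, the effect of rotating an equal-armed apparatus through 90°, and the doubled figure which results when the speed of the perpendicular ray is taken to be c.

```python
import math

C = 299_792_458.0   # speed of light in m/s
V_EARTH = 3.0e4     # assumed speed of the earth through the ether, m/s
ARM = 11.0          # assumed effective arm length, metres


def beta(v, c=C):
    """The coefficient (1 - v^2/c^2)^(-1/2) used in the text."""
    return 1.0 / math.sqrt(1.0 - (v / c) ** 2)


def t_parallel(L, v, c=C):
    """Classical round-trip time along the arm lying in the direction of motion."""
    return L / (c - v) + L / (c + v)


def t_perpendicular(l, v, c=C):
    """Classical round-trip time along the arm perpendicular to the motion."""
    return 2.0 * l / math.sqrt(c ** 2 - v ** 2)


def rotation_time_difference(L, v, c=C):
    """Total time difference produced by a 90 degree rotation, equal arms."""
    return 2.0 * (t_parallel(L, v, c) - t_perpendicular(L, v, c))


def approx_difference(L, v, c=C):
    """Leading-order value 2 L v^2 / c^3 of the rotation difference."""
    return 2.0 * L * v ** 2 / c ** 3


def michelson_1881_difference(L, v, c=C):
    """Same calculation with the perpendicular ray wrongly given speed c."""
    return 2.0 * (t_parallel(L, v, c) - 2.0 * L / c)


if __name__ == "__main__":
    print("rotation difference :", rotation_time_difference(ARM, V_EARTH))
    print("2 L v^2 / c^3       :", approx_difference(ARM, V_EARTH))
    print("1881 calculation    :", michelson_1881_difference(ARM, V_EARTH))
```

With these assumed values the rotation difference comes out somewhat below 10^-15 seconds, and the 1881 calculation gives very nearly twice that figure.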

Michelson, together with Morley, repeated the experiment in 1887; 
they increased precision by making the light rays $R_1$ and $R_2$ travel several 
times between the mirrors. Still no shift of the fringes was observed. 
Lorentz was now convinced that the Michelson-Morley result was a 
serious difficulty, not just for Fresnel’s but also for his own theory. 

1 Lorentz [1886]. There was a series of experiments performed by Michelson and not 
just one (I sometimes refer to the Michelson-Morley experiment, meaning the one 
performed in 1887); and, as Lakatos emphasized, in the long contest between the 
theoretician Lorentz and the experimenter Michelson, Lorentz always held the upper 
hand and eventually confused Michelson to such an extent that he gave up the hope of 
even interpreting his own experiments. (Cf. Lakatos [1970], 3 (d1)). 

(b) The Standard Accounts of the Role of the Michelson-Morley Experiment 
in Einstein’s Victory over Lorentz. 


(b1) The Inductivist Account. 
Inductivists maintain that Einstein’s second postulate concerning the 
invariance of c is a valid generalisation of Michelson’s result. 

According to Max Born ‘...the second statement, that of the constancy 
of the velocity of light, must be regarded as being experimentally 
established with certainty.’ 1 

Reichenbach claims ‘... it would be mistaken to argue that Einstein’s 
theory gives an explanation of Michelson’s experiment, since it does not 
do so. Michelson’s experiment is simply taken over as an axiom.’ 2 
Inductivism makes a dual claim, namely that Einstein succeeded where 
Lorentz failed. This is underlined by Kompaneyets in his (otherwise 
excellent) textbook: ‘A direct experiment was performed which showed 
that the velocity of light cannot be combined with any other velocity and, 
in all reference systems it is equal to a universal constant c. This was the 
famous Michelson experiment.’ 3 

In other words, it is alleged that the experiment established the in- 
variance of c, which entails the breakdown of the addition law of velocities; 
since the addition law follows from the Galilean transformation which 
forms part of Lorentz’s system, the latter is refuted by the same experi- 
ment. Thus inductivism claims that the Michelson-Morley experiment 
simultaneously defeated Lorentz and established a fundamental postulate 
of Einstein’s theory. 

The inductivist account fails both for logical and for historical reasons. 
From the logical point of view, it has by now become a platitude that 
observation reports neither establish nor even probabilify high-level 
theories. None of Michelson’s observational statements is equivalent to 
the proposition that in all inertial frames the speed of light is a universal 
constant independent of the velocity of the source. As far as history is 
concerned, as Michael Polanyi pointed out, ‘Michelson’s experiment had 
a negligible effect on the discovery of Relativity’.4 Shankland’s account 
supports Polanyi’s claim: ‘When I asked him [i.e. Einstein] how he had 
learned of the Michelson-Morley experiment, he told me that he had 
become aware of it through the writings of H. A. Lorentz, but only after 
1905 had it come to his attention.’ 5 

(b2) The Falsificationist Account. 

There is a falsificationist account of the role of the Michelson-Morley 


1 Born [1962], p. 225. 
2 Reichenbach [1958], p. 201. 
3 Kompaneyets [1962], p. 191. 
4 Polanyi [1958], p. 10. 
5 Shankland [1963], pp. 47-8. 

experiment which has something in common with the inductivist account, 
but which attributes Lorentz’s failure and Einstein’s success to their 
respectively ad hoc and non ad hoc responses to the experiment. On this 
account the ‘crucial’ experiment refuted the conjunction of Galilean 
kinematics, Newton’s laws and Maxwell’s equations for the ether. In order 
to explain away Michelson’s result, Lorentz resorted to an auxiliary 
assumption, the Lorentz-Fitzgerald Contraction Hypothesis (hereafter 
referred to as the L.F.C.), which was however ad hoc. Thus, for example 
Popper writes: ‘An example of an unsatisfactory auxiliary hypothesis 
would be the Contraction Hypothesis of Fitzgerald and Lorentz which 
had no falsifiable consequences but merely served to restore the agreement 
between theory and experiment—mainly the findings of Michelson and 
Morley.’ 1 

Now Einstein, on his own admission, was familiar with part of Lorentz’s 
work, in particular with the latter’s [1892a] and his [1895]. Thus the 
falsificationist can easily explain what prompted Einstein to propose his 
own theory. Einstein, having realised that the L.F.C. is ad hoc, proposed 
special relativity theory (hereafter referred to as the S.R.T.) as a better— 
non ad hoc—alternative. 


(b3) Holton’s Account. 
In his [1969] Holton gives further support to the claim that the L.F.C. 
was ad hoc: ‘This saving Hilfshypothese [i.e. the L.F.C.] is introduced 
completely ad hoc... . No explicit comment is made which connects this 
assumed shrinkage with the Lorentz transformations in their still primitive 
form as published earlier in the book [Lorentz’s [1895]].’ 2 

Holton, however, attaches to ad hocness a meaning different from that 
attached to it by Popper. For Holton the L.F.C. is ad hoc because it is 
not integrated into the rest of Lorentz’s system. It is, for example, not 
connected with Lorentz’s transformation equations. 

In the first section of this paper I propose to refute all the charges of 
ad hocness which have been levelled at the L.F.C. and show that Lorentz’s 
programme progressed until after 1905. 


1 The Progress of Lorentz’s Programme. 

1.1. Methodological Preliminaries. 

I shall appraise the progress of Lorentz’s programme in the terms provided 
by the methodology of research programmes.3 A scientific research 
programme is characterised by a hard core and by a heuristic. The hard core 


1 Popper [1935], section 20. 
2 Holton [1969], p. 171. 
3 Cf. Lakatos [1968b], [1970], [1971a] and [1971b]. 


100 Elie Zahar 


consists of assumptions which, by methodological decision, as it were, are 
kept unfalsified. Each theory in the programme is a conjunction of, on the 
one hand, the hard core and, on the other, of auxiliary hypotheses to which 
the modus tollens is directed whenever anomalies arise. A programme also 
has a heuristic which consists of a set of suggestions and hints which govern 
the construction or modification of the auxiliary hypotheses. The heuristic, 
which sets out a research policy, is less rigid than the hard-core. A good 
example of a (progressive) research programme, as I shall argue, is 
Lorentz’s ether programme. Its hard core consists of Maxwell’s equations 
for the electromagnetic field; of Newton’s laws of motion and of the 
Galilean transformation, to which Lorentz added his equation: 


$$\mathbf{F} = e\left(\mathbf{D} + \frac{1}{c}\,\mathbf{v} \wedge \mathbf{H}\right)$$


for the so-called Lorentz force. The heuristic of the programme arises 
from the overall metaphysical principle¹ that all physical phenomena are 
governed by actions transmitted by the ether. Applications of the heuristic 
to specific problems (which may or may not be set by ‘refutations’ or 
anomalies) generate a sequence of theories. We shall be mainly concerned 
with three consecutive theories belonging to Lorentz’s ether programme. 
I shall refer to them as $T_1$, $T_2$ and $T_3$. 

$T_1$ consists of the hard core as defined above together with the (tacit!) 
assumptions (i) that moving clocks are not retarded and (ii) that material 
rods are not shortened by their motion through the ether. 

$T_2$ is obtained from $T_1$ by substituting the L.F.C. for assumption (ii). 
According to the L.F.C. a body moving through the ether with velocity $v$ 
is shortened by the factor $\sqrt{1-v^2/c^2}$. 

$T_3$ is the conjunction of the hard core, of the L.F.C. and of the assumption 
that, contrary to (i), clocks moving with velocity $v$ are retarded by the 
factor $\sqrt{1-v^2/c^2}$.2 

I claim that both the shift from $T_1$ to $T_2$ and that from $T_2$ to $T_3$ were 
non ad hoc. This implies in particular that the introduction of the L.F.C. 
which took Lorentz from $T_1$ to $T_2$ was not an ad hoc manoeuvre. Let me 
first clarify the various contrary claims. All the more or less vague charges 
of ad hocness, including Holton’s, which have been levelled at the L.F.C. 





1 For the connection between metaphysics and heuristic, cf. Watkins [1958]. 

2 This is a slight simplification. We shall see that Lorentz deduced the L.F.C. from the 
Molecular Forces Hypothesis. However, although he used an instrumental notion of 
‘local time’, he did not realise that the M.F.H. entails the Clock Retardation Hypothesis. 
Only after Einstein published his results in 1905, did it occur to Lorentz that the 
M.F.H. implies that moving clocks are retarded by the factor $\sqrt{1-v^2/c^2}$. In other words: 
before 1905 Lorentz did not give a realistic interpretation of ‘local time’. 

are captured by means of the three notions of ad hocness used to appraise 
research programmes. 

Ad hocness in research programmes is defined not as a property of an 
isolated hypothesis but as a relation between two consecutive theories. A 
theory is said to be ad hoc₁ if it has no novel consequences as compared 
with its predecessor. It is ad hoc₂ if none of its novel predictions have 
been actually ‘verified’; for one reason or another the experiment in 
question may not have been carried out, or—much worse—an experiment 
devised to test a novel prediction may have yielded a negative result. 
Finally the theory is said to be ad hoc₃ if it is obtained from its predecessor 
through a modification of the auxiliary hypotheses which does not accord 
with the spirit of the heuristic of the programme. 

Since ‘ad hocness’ depends in an essential way on the notion of novelty 
of facts, this notion has to be examined in some detail. 

Before embarking on a general discussion let us consider a few concrete 
examples of novel facts. Lakatos mentions the return of Halley’s comet 
as a new fact anticipated by the Newtonian programme and, of course, I 
agree with him that the discovery of any new type of fact is the discovery 
of a novel fact. But, if we equate novelty simply with temporal novelty, 
we are driven into a paradoxical situation. We should, for example, have 
to give Einstein no credit for explaining the anomalous precession of 
Mercury’s perihelion, because it had been recorded long before General 
Relativity was proposed. Similarly, we should have to say, contrary to 

informed opinion, that Michelson’s experiment did not confirm Special 
Relativity and Galileo’s experiments on free fall did not confirm Newton’s 
theory of gravitation. Lakatos, who does not easily dismiss the judgments 
of physicists², is aware of this difficulty and tries to avert it by shifting 
his original view and saying that, in the light of a new theory, some known 
facts may ‘turn into’ novel ones. For example, whereas Balmer merely 


1 Students of methodology will realise that my characterisation of the notion of ‘ad hoc₂’ 
differs from that given by Lakatos. I characterise a theory as ad hoc₂ (at time t) if none 
of its excess content over its rivals has, at time t, been corroborated. Thus, on my 
characterisation, if a theory is non ad hoc₂, it has predicted a novel fact, i.e. a theory is 
empirically progressive if it is non ad hoc₂. Lakatos on the other hand characterises a 
theory as ad hoc₂ if all of its excess content has been refuted. (Cf. his [1968b].) Thus, on 
Lakatos’s characterisation, a theory can be empirically non-progressive (none of its 
novel predictions have been corroborated) and at the same time non ad hoc₂ (not all of its 
novel predictions have been refuted). Apart from Lakatos’s spoiling of symmetry (I 
take over from him the definitions: non ad hoc₁ = theoretically progressive, non ad 
hoc₃ = heuristically progressive) my characterisation is clearly more in the spirit of 
Lakatos’s enterprise than his own. He (correctly) stresses dramatic confirmation as 
opposed to refutation; my ‘ad hoc₂’ is simply the negation of his notion of ‘empirically 
progressive’. (Lakatos’s own notion of ‘non ad hoc₂’ seems to be simply a rechristening 
of Popper’s ‘third requirement’ (Popper [1963], p. 242). But how can we ever establish 
(even tentatively) that all of a theory’s excess content is false?) 

2 Lakatos [1971a], pp. 120-2 and [1971b], pp. 179-80. 

‘observed’ that the hydrogen lines obey a certain formula, Bohr connected 
these lines with the energy levels of the electron in the hydrogen atom. 

However, Lakatos’s modified notion of ‘novel fact’ is open to the 
following fatal objection. Any theory is a set of propositions connecting 
different terms and relations. We can always define the properties of a 
physical entity like mass through the relations which ‘mass’ bears to other 
concepts and notions within a given theory. Consequently a new hypo- 
thesis will generally ascribe new meanings to old terms. For instance, any 
experimental consequence of relativity theory involving say mass, would 
trivially become the expression of a novel fact. Thus the fact that a steel 
ball rolling down a slope takes a certain time to reach the bottom, could 
become a novel fact when the steel ball is considered as having relativistic 
mass. This is obviously absurd. Therefore, Lakatos’s 1970 criterion for 
novelty is too liberal, while his 1968 criterion is too stringent. 

Although Michelson’s result contains a reference to ‘length’ which 
acquires a new meaning in Special Relativity, one must not claim that 
Michelson’s result has thereby been changed into a novel fact. The 
Michelson result is indeed novel vis-a-vis Special Relativity, but its 
novelty does not rest on this reinterpretation of ‘length’; since the ‘crucial’ 
experiment can be described in an ‘observational’ language which, though 
theory-laden, remains unaffected by theory-change.2 ‘The arms of the 
interferometers have equal lengths’ can for instance be replaced by ‘The 
extremities of the two arms can be made to coincide by placing the two 
arms alongside each other’. 

When then does a new prediction lend—if experimentally corroborated 
—genuine support to a theory, and when is such support only spurious? 
Consider the following situation. We are given a set of facts and a theory 
$T[\lambda_1, \ldots, \lambda_n]$ which contains an appropriate number of parameters. Very 
often the parameters can be adjusted so as to yield a theory T* which 
‘explains’ the given facts; it may even happen, given sufficiently many 
degrees of freedom, that new dramatic relations between old facts can be 
exhibited (or rather fabricated). For example, by expanding a harmonic 
function $\phi$ in terms of spherical harmonics and then adjusting the 
coefficients of the expansion, the precession of Mercury’s perihelion can be 
accounted for within Newtonian physics.3 In such a case we should 
certainly say that the facts provide little or no evidential support for the 


1 Lakatos’s shift from a bold purely temporal concept to a watered down concept of 
novelty becomes clear when one compares Lakatos [1968a], pp. 381-7, with Lakatos 
[1970], pp. 155-7. The question is whether this shift is historiographically progressive or 
ad hoc on his own criterion given in his [1971a] and [1971b]. 

2 This was first pointed out by Reichenbach in the introduction of his posthumous 
[1965]. 

3 See Adler, Bazin and Schiffer [1965], p. 202. 

theory, since the theory was specifically designed to deal with the facts. In 
other words, the way in which a theory is constructed is relevant for an 
appraisal of its merits. If we are given only the end-product T* which 
predicts facts a, b and c, we shall in general be unable to determine whether 
a, b and c lend genuine support to T* or whether T* was simply cleverly 
engineered to yield the known facts through an adjustment of parameters. 
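
The point about adjustable parameters can be made concrete with a toy calculation. The sketch below is in Python and is purely illustrative: the two sets of 'facts' are made-up numbers, the function name `fit_through` is hypothetical, and a polynomial stands in for a theory with as many free coefficients as there are facts. Whatever the facts happen to be, the coefficients can be chosen so that all of them are reproduced exactly, which is why the mere fit lends no support.

```python
from fractions import Fraction


def fit_through(points):
    """Return a polynomial of degree len(points) - 1, as a callable,
    which passes exactly through every given (x, y) point.

    The construction is Lagrange interpolation; exact rational
    arithmetic is used so that the fit involves no rounding.
    """
    xs = [Fraction(x) for x, _ in points]
    ys = [Fraction(y) for _, y in points]

    def poly(x):
        x = Fraction(x)
        total = Fraction(0)
        for i, (xi, yi) in enumerate(zip(xs, ys)):
            term = yi
            for j, xj in enumerate(xs):
                if j != i:
                    term *= (x - xj) / (xi - xj)
            total += term
        return total

    return poly


# Two incompatible, invented sets of 'known facts'.
known_facts = [(0, 1), (1, 3), (2, 2), (3, 5)]
rival_facts = [(0, 1), (1, 0), (2, 7), (3, -4)]

if __name__ == "__main__":
    p = fit_through(known_facts)
    q = fit_through(rival_facts)
    print([p(x) == y for x, y in known_facts])  # every known fact is 'explained'
    print([q(x) == y for x, y in rival_facts])  # and so is every rival fact
    print(p(4), q(4))                           # only a further fact can discriminate
```

Both fitted curves account for their respective data perfectly, so the agreement reflects the adjustment of the coefficients and not any merit of the underlying form; only a value not used in the fitting, such as the one at x = 4 here, can count as a test.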
This suggests the following redefinition of the notion of ‘novel fact’. 
A fact will be considered novel with respect to a given hypothesis if it did 
not belong to the problem-situation which governed the construction of the 
hypothesis. Consider two consecutive theories $T_1$ and $T_2$ in the same 
research programme; suppose that $T_1$ faces two anomalies $e_1$ and $e_2$ and 
that $T_2$ was specifically evolved in order to account for $e_1$; if it is then 
found that $T_2$ also explains $e_2$, $e_2$, in contradistinction to $e_1$, will be taken 
to provide evidential support for $T_2$. This proposal rests on the fact that 
ingenious and imaginative scientists can always construct theories which 
account for a finite number of known facts. Of course, under this definition, 
any temporally new type of experimental result e will be novel, since 
any theory which implies e could not have been proposed in the light of 
the evidence e. Temporal novelty in a research programme is then a 
sufficient but not a necessary condition for novelty.1 A temporally new 
fact may have greater psychological impact than some known fact, but this, 
on its own, is irrelevant to the objective empirical support which it lends 
to a hypothesis.? 

My re-definition of novelty amounts to the claim that in order to assess 
the relation between theories and empirical data within a research-programme, 
one has to take into account the way in which a theory is built and the problems 
it was designed to solve. 

This new criterion for novelty of facts also implies that the traditional 
methods of historical research are even more vital for evaluating experi- 
mental support than Lakatos had already suggested. The historian has to 
read the private correspondence of the scientist whose ideas he is studying; 

1 John Worrall offered an amusing counter-example to the idea that temporal novelty is 
necessary: a socially isolated theoretician has an idea which, he realises, as well as 
explaining certain known facts, predicts a temporally novel fact. He asks an 
experimentalist friend to test the prediction without bothering to explain to him how he 
arrived at the prediction. The prediction is corroborated. The experimental result is 
submitted to journal $J_1$, and the theoretical idea to journal $J_2$. Journal $J_2$ takes two 
years longer to publish articles than does journal $J_1$, and so the fact is known to the 
scientific world for two years before it receives a theoretical explanation. Temporal 
novelty can be a mere accident. 

2 This amendment of the definition of novelty confirms Lakatos’s views about the 
methodology of research programmes as applied to itself. (Cf. Lakatos [1971a], 
pp. 116-22). Progress in the methodology of research programmes consists in proposing finer 
demarcation criteria between scientific progress and scientific degeneration. This, I 
hope, is what my definition of novelty does. 

his purpose will not be to delve into the psyche of the scientist, but to 
disentangle the heuristic reasoning which the latter used in order to 
arrive at a new theory. Let us give an example. In Newton’s time there 
was a well-known inverse square law for the intensity of light; Newton 
might have used some reasoning by analogy in order to propose that the 
gravitational ‘intensity’ is also distributed over the surface of a sphere and 
hence obeys an inverse square law; in this case Kepler’s laws would 
support gravitational theory more strongly than if Newton had used them 
as his heuristic starting point.1 


1.2 Popper, Grünbaum and Holton on Lorentz and Einstein. 


Having clarified the notion of ‘ad hocness’ and ‘novel fact’ let us now turn 
to Lorentz’s theory $T_2$² and examine whether it is ad hoc in any one of the 
three senses explained above. 

We have seen that, in his [1935], Popper looked upon $T_2$ as ad hoc, but 
in 1959 Grünbaum showed that the Lorentz-Fitzgerald Contraction 
Hypothesis does not constitute an ad hoc modification of the ether theory 
in the sense now under discussion, since its confirmation is possible in 
an experiment different from the Michelson-Morley type.3 In the 
Kennedy-Thorndike-type experiment⁴ described by Grünbaum the arms of the 
interferometer have different lengths; $T_2$ predicts that the difference 
$(t_2 - t_1)$ between the times it takes the two rays $R_1$ and $R_2$ to return to the 
half-silvered mirror is equal to: 


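$$\frac{2}{c}\,\beta\,(L-l) = \frac{2(L-l)}{\sqrt{c^2-v^2}}.$$


This quantity is different both from the value predicted by Special 
Relativity and from the one predicted by the classical theory. These two 
theories respectively yield the following values for $(t_2 - t_1)$: 


$$\frac{2(L-l)}{c} \quad\text{and}\quad \frac{2}{c}\,\beta\,(\beta L - l).$$ 5 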
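
The three predictions for an interferometer with unequal arms can be set side by side. The following is a minimal sketch in Python under stated assumptions: the function names and the sample arm lengths and speeds are invented for the illustration, and the formulas are simply those obtained from the classical round-trip times, with the arm lying along the motion shortened by the factor $\sqrt{1-v^2/c^2}$ in the case of the contraction hypothesis.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def beta(v, c=C):
    """The coefficient (1 - v^2/c^2)^(-1/2)."""
    return 1.0 / math.sqrt(1.0 - (v / c) ** 2)


def dt_classical(L, l, v, c=C):
    """Classical ether theory without contraction: (2/c) * beta * (beta*L - l)."""
    b = beta(v, c)
    return (2.0 / c) * b * (b * L - l)


def dt_contraction(L, l, v, c=C):
    """Ether theory with the L.F.C.: (2/c) * beta * (L - l)."""
    return (2.0 / c) * beta(v, c) * (L - l)


def dt_special_relativity(L, l, c=C):
    """Special Relativity in the laboratory frame: 2 (L - l) / c."""
    return 2.0 * (L - l) / c


if __name__ == "__main__":
    L, l = 2.0, 1.0                      # assumed unequal arm lengths, metres
    for v in (0.1 * C, 0.2 * C):         # assumed, deliberately large, speeds
        print(v / C,
              dt_classical(L, l, v),
              dt_contraction(L, l, v),
              dt_special_relativity(L, l))
```

For equal arms the contraction hypothesis and Special Relativity both give zero, as in the Michelson-Morley case; for unequal arms the value given by the contraction hypothesis varies with v while the relativistic value does not, which is what makes an independent test possible.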


1 The heuristic of certain programmes makes it very difficult for them (or even impos- 
sible) ever to achieve empirical progress. For instance Plato, through advocating the 
saving of phenomena by combinations of circular motions, condemned the Greek 
astronomical programme to degeneration; the epicycles acted as an infinite set of 
parameters which could be adjusted so as to account for any periodic motion, after the 
latter had been observed. There was no uniform method of epicycle-construction which 
was independent of the facts and hence capable of anticipating them. Also cf. Lakatos 
and Zahar [1974]. 

2 Cf. above, p. 100. 
3 Cf. Grünbaum [1959] and [1963]. 
4 Incidentally Kennedy and Thorndike thought that ‘using this null result [i.e. the null 
result of the Kennedy-Thorndike experiment] and that of the Michelson-Morley 
experiment, [they could] derive the Lorentz-Einstein transformations, which are 
tantamount to the relativity principle’. (Cf. Kennedy and Thorndike [1932], p. 400). 

5 Cf. above, p. 96. 

Popper accepted Grünbaum’s criticism as valid.1 Thus it is settled that 
Lorentz’s $T_2$ was not ad hoc₁. It will be more difficult to decide whether 
$T_2$ was ad hoc₂ and/or ad hoc₃. 

Was $T_2$ ad hoc₂ in the sense that until 1905 nobody bothered to test 
its novel predictions? Later I shall show that it was not; but even had it 
been, this constitutes no damning criticism of the ether programme; it 
would mean only that its empirical progressiveness had not yet been shown. 
Ad hocness in the second sense becomes a demerit of a research programme 
only if it is a lasting feature. 

Is then $T_2$ ad hoc₃? In actual scientific practice a hypothesis is intuitively 
judged to be ad hoc if it looks arbitrary or if it fits poorly into the research 
programme. Thus the introduction of a theory which is ad hoc in this 
sense destroys the organic unity of the whole nexus, since the various 
components of the resulting system are structured according to conflicting 
plans. For example, if, within the ether programme, a theory postulating 
some new instantaneous action-at-a-distance were proposed, the new 
theory would be found intuitively ad hoc. 

Ad hocness₃ is a good explication of this intuitive notion of ad hocness. 
To repeat, a theory is said to be ad hoc₃ if it conflicts with the heuristic 
of the research programme. If it could be shown that $T_2$ is ad hoc₃, then 
two results would be achieved at one stroke: from a methodological point 
of view Lorentz’s programme would be shown to have had serious defects 
in 1905; from an historical point of view, it would become plausible that 
Einstein, who attributed so much importance to the criterion of ‘internal 
perfection’, was motivated to start his rival programme by the patched-up 
state of Lorentz’s $T_2$. 

Holton seems to be claiming that Lorentz’s $T_2$ was both ad hoc₂ and 
ad hoc₃; and that Einstein’s programme was triggered off by Einstein’s 
recognition of these two defects. I shall show however that Holton is 
wrong on all counts and that his (rather Polanyiite) methodology misleads 
him into false history. 

In his [1969] Holton writes: 

‘This saving Hilfshypothese [the L.F.C.] is introduced completely ad hoc... No 
explicit comment is made which connects this assumed shrinkage with the Lorentz 
transformations in their still primitive form, as published earlier in the book... 
The contraction hypothesis when it was made was clearly and quite blatantly 
ad hoc—or, if one prefers to use the patois of the laboratory, ingeniously cooked 
up for the narrow purpose which it was to serve... 

The important point to note is that ‘ad hoc’ is not an absolute but a relativistic 
term. Postulate 1 and 2 [Einstein’s two postulates in his [1905]] may be said 
to have been introduced ad hoc with respect to the Relativity Theory of 1905 


1 Popper [1969], p. 51. 

as a whole... But these postulates were not ad hoc with respect to the Michelson 
experiment, for they were not specifically imagined in order to account for its 
results...’ 1 


Holton makes two distinct claims. His first claim is that the Contraction 
Hypothesis (L.F.C.), which differentiates $T_2$ from $T_1$, was not connected 
with the rest of $T_2$, in particular with the Lorentz transformation equations, 
and thus, on my terms, the L.F.C. is ad hoc₃. His second claim is that the 
L.F.C. was specifically engineered in order to account for Michelson’s 
result; from this, together with the fact that for a long time the Michelson 
result was the only one which even seemed to support the L.F.C., I 
conclude that, in Holton’s view, the L.F.C. was not independently tested 
and therefore was ad hoc₂. But both of Holton’s claims are false: 

(a) Lorentz deduced the L.F.C. from a deeper theory, namely from what 
I call the Molecular Forces Hypothesis (hereafter referred to as the M.F.H.) 
and which can be loosely formulated as follows: ‘Molecular forces behave 
and transform like electromagnetic forces.’ Moreover, in his deduction 
of the L.F.C., Lorentz made use of his famous transformation, as is 
clearly indicated by the following passage from his [1895]: 


‘For, if we now understand by $S_1$ and $S_2$ not, as formerly, two systems of 
charged particles but two systems of molecules—the second at rest and the 
first moving with velocity $v$ in the direction of the axis of $x$—between the 
dimensions of which the relation subsists as previously stated; and if we assume that 
in both systems the $x$-components of the forces are the same, while the $y$- 
and $z$-components differ from one another by the factor $\sqrt{1-v^2/c^2}$, then it is 
clear that the forces in $S_1$ are in equilibrium whenever they are so in $S_2$... 
The displacement would naturally bring about this disposition of the molecules 
of its own accord and thus effect a shortening in the direction of motion in the 
proportion of 1 to $\sqrt{1-v^2/c^2}$ in accordance with the formulae given in the 
above-mentioned paragraph.’ 2 





I further maintain that for anybody prepared to accept the assumption 
of an ether at rest, the M.F.H. is a plausible auxiliary hypothesis which 
introduces no alien elements into Lorentz’s programme. Putting it more 
objectively, the theory T₂ proposed by Lorentz is non ad hoc₃, because the
M.F.H. is structured in accordance with the heuristic of the ether pro- 
gramme, which requires that physical phenomena be explained in terms 
of actions propagated in the ether. 

1 Holton [1969], pp. 177-181; my italics. The term ‘relativistic’ was probably a slip of
the pen and should read ‘relative’. But I cannot make head or tail of Holton’s sentence
‘Postulates 1 and 2 may be said to have been introduced ad hoc with respect to the
Relativity Theory of 1905 as a whole’. How can Einstein’s two postulates, which
constitute Special Relativity Theory or at any rate are part of Relativity Theory, be
ad hoc with respect to Relativity Theory?
2 Einstein and others [1923], p. 7; my italics.
(b) Moreover, the M.F.H. arose out of considerations which had nothing
to do with Michelson’s experiment. The M.F.H. arose out of mathematical 
considerations pertaining to the transformation properties of Maxwell’s 
equations. Hence Michelson’s null result is a novel fact relative to the 
M.F.H.! The M.F.H. is consequently non ad hoc₂; it constituted both
theoretical and empirical progress. 

Let me now briefly turn to the implications of my methodological 
theses for the historical accounts of the Einsteinian Revolution, and in 
particular for Holton’s account. My claim is that the L.F.C. is non ad hoc, 
and that Michelson’s result, far from providing an obstacle for Lorentz’s 
programme, in fact supported it. This clearly rules out all explanations 
of the genesis of S.R.T. which depend on the assumption that Einstein 
correctly realised the L.F.C. was ad hoc relative to Michelson’s result. One 
such explanation is Holton’s, even though he attributes only an indirect 
role to Michelson’s result. He alleges that Einstein was dissatisfied with 
the L.F.C. because it was blatantly ad hoc; it was ‘cooked up for the 
narrow purpose which it was to serve’. But ad hoc relative to what?¹
Obviously Holton’s claim is that the L.F.C. was ad hoc relative to 
Michelson’s result, since the ‘narrow purpose which it was to serve’ was 
precisely an explanation of the null result of the ‘crucial’ experiment. 
Holton adds that: 


‘the problem Einstein saw was not the logical status of the Contraction Hypothesis,
not Michelson’s experimental result (for it could be accommodated,
even if not ‘ohne Weiteres’) but the inability of Lorentz’s theory to fulfil the
criterion of ‘inner perfection’ of a theory.’ ²


So Lorentz’s theory lost its ‘inner perfection’ on the introduction of 
the L.F.C., which was contrived for the sole purpose of explaining 
Michelson’s result. Thus, on this account, the ‘crucial’ experiment did 
play an important—if indirect—role in the genesis of S.R.T.: the search 
for an explanation of Michelson’s result compelled Lorentz to resort to an 
hypothesis whose ad hoc character provided Einstein with a good reason 
for starting his revolutionary new programme. 

One might defend this Holtonian account against my arguments by 
assuming that Einstein appraised the L.F.C. incorrectly.³ Perhaps Einstein
mistakenly regarded Lorentz’s L.F.C. as ad hoc in the sense of being 
1 Holton correctly points out that ‘a statement may be ad hoc relative to one context but
not ad hoc relative to another’. (Holton [1969], p. 181.)

2 Holton [1969], pp. 184-5.

3 No doubt Holton would argue that the fact that my methodology admits of this possi- 
bility shows the folly of trying to appraise the actions of great scientists with the help 
of explicit general criteria. Fortunately, on my account, Einstein did not make such a 
mistake. 
engineered simply for the purpose of neutralising Michelson’s result. But
this assumption is highly implausible. Einstein read the Versuch in which
Lorentz proposed the M.F.H. and derived the L.F.C. from it. Further,
Einstein could hardly have regarded the L.F.C. as more ad hoc than his
own light-postulate. For, first, on Einstein’s own criteria, the S.R.T., as 
presented in 1905, was far from being ‘internally perfect’. It consisted 
of two heterogeneous parts which were on totally different levels: on the 
one hand a high-level, universal covariance principle and on the other a 
so-called light postulate which was both low-level and extremely counter- 
intuitive. And secondly, whereas Lorentz explained why a moving rod 
contracts, Einstein bluntly asserted that the speed of light is an invariant, 
an assumption from which Michelson’s result trivially follows. Reichenbach 
was at least partially correct when he wrote that: 


‘it would be mistaken to argue that Einstein’s theory gives an explanation of 
Michelson’s experiment since it does not do so. Michelson’s experiment is 
simply taken over as an axiom.’ 


Reichenbach was right in the following sense: while intuitively the light- 
postulate can be regarded as a low-level generalisation of Michelson’s 
result, the result is prima facie unconnected with the M.F.H. (The fact 
that the light postulate is both low-level and counterintuitive was recog- 
nized by many of Einstein’s contemporaries and—understandably—gave 
rise to the myth that Einstein was a positivist who unquestioningly obeyed 
the dictates of experience.) 

My view that Einstein could hardly have judged Lorentz’s M.F.H. as 
more ad hoc than his own light postulate is supported by the following fact. 
The proposition that in all inertial frames the measured speed of light 
must be equal to the same constant c is deducible from Lorentz’s pre-1905
system, which includes the M.F.H.¹ In other words, Lorentz’s theory
explains not only Michelson’s null result but also the invariance of c. Whatever
meaning is attached to ‘ad hoc relative to a context’, it cannot allow that 
the M.F.H. should both imply the light postulate and, unlike the light 
postulate, be ad hoc relative to Michelson’s experiment. Lorentz was 
justified in asserting that: 


‘... the chief difference [is] that Einstein simply postulates what we have deduced,
with some difficulty and not altogether satisfactorily, from the fundamental 
equations of the electromagnetic field.’ 2 


1 Admittedly Lorentz performed the deduction only in 1909 (cf. below, p. 121). However,
I am enough of a Polanyiite to find it implausible that Einstein failed to realise, prior
to 1905, that, from an intuitive point of view, the M.F.H. cannot be more ad hoc than his
own light postulate.

2 Cf. Lorentz [1909], p. 230; my italics. 
Einstein’s programme eventually proved superior to Lorentz’s in a
strictly objective sense,¹ but this superiority does not rest on the ad hoc
character of Lorentz’s system. 


1.3 The Double Heuristic Role of Mathematics in Science. 


I earlier claimed that the M.F.H. had its origins in mathematical con- 
siderations. Before substantiating this claim, I shall examine in general 
terms the heuristic role which mathematics can play in the development 
of scientific research programmes. 

It is well known that science has stimulated the development of mathe- 
matics. Physics sets problems for which an urgent mathematical solution 
is required; as a result, certain branches of pure mathematics receive a 
powerful impetus. For example, Newton invented the Calculus specifically 
for the study of continuous and differentiable motion: the fluent variable 
was time and the fluxion, instantaneous velocity. Thus analysis, the 
theory which dominated pure mathematical thinking for over two centuries, 
owes its origin to physics. The study of differential equations and the 
development of what later came to be called the Advanced Calculus were 
also closely connected with the development of the Newtonian programme 
in the 18th century. A similar process took place in the 19th century when 
Faraday, using ‘line of force’ as a new physical concept, enunciated laws 
of which Maxwell later gave a mathematical formulation; this, together 
with hydrodynamics, contributed to the development of vector analysis. 

These examples illustrate the heuristic function of physics with regard 
to mathematics. But what about the reverse process, namely the heuristic 
function of mathematics with regard to physics? 


(a) Increase of Empirical Content through Translation into Mathematical 
Language and through the Physical Interpretation of Mathematical Entities. 

There are two important ways in which mathematics furthers physical 
discovery. 

The scientist may start from an intuitive physical principle. Through 
being ‘translated’ into one of the mathematical languages available at the 
time, the principle may be modified; in particular it may acquire additional 
structure and thus become a stronger physical assumption. For example, 
Fresnel set out to give a mathematical formulation of his conjecture that 
light is a wave process in the ether. He instinctively resorted to the 
periodic function with which he was most familiar, namely the sine 
function. His original assumption that light is a wave phenomenon was


1 I shall argue this at length in sections 2 and 3.
obviously weaker than the hypothesis he actually used, namely that the
wave is representable by the function $\sin(2\pi t/T)$.¹

Peierls gives another example which beautifully illustrates my thesis. 
Peierls gives an account of the discovery of Maxwell’s equations which 
runs as follows.²

Maxwell translated Faraday’s intuitive physics into the theory of partial 
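differential equations. The following relations summarise what was known
about the electromagnetic field in Maxwell’s time:

$$(1)\ \nabla \cdot \vec{D} = \rho \qquad\qquad (3)\ \nabla \wedge \vec{E} = -\frac{1}{c}\frac{\partial \vec{B}}{\partial t}$$

$$(2)\ \nabla \cdot \vec{B} = 0 \qquad\qquad (4)\ \nabla \wedge \vec{H} = \vec{j}$$

Through taking the divergences of both sides of the fourth equation,
we obtain $0 = \nabla \cdot \vec{j}$. This contradicts the law of conservation of charge
which is expressed by the following equation:

$$(5)\ \nabla \cdot \vec{j} + \frac{1}{c}\frac{\partial \rho}{\partial t} = 0$$

Instead of altogether rejecting either his own mathematical approach
or Faraday’s physical theory, Maxwell saw that he could restore the
consistency of equations (1)-(5) by adding the extra term $(1/c)(\partial \vec{D}/\partial t)$
to the right-hand side of (4). In this way he obtained a new theory
consisting of (1)-(3) and:

$$\nabla \wedge \vec{H} = \vec{j} + \frac{1}{c}\frac{\partial \vec{D}}{\partial t}$$
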
The new theory implies (5). 

Thus, if this account is correct, Maxwell directly tampered with the 
mathematical form of the equations rather than trying first to modify 
Faraday’s system and only then translating it into a new and hopefully 
consistent mathematical form. Of course, through altering the mathema- 
tical expression of the theory, Maxwell also modified its physical content. 
But mathematical considerations led the way. 
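
The consistency claim in Peierls' reconstruction can be checked in one step. The following is only a sketch: it assumes that equation (1) is Gauss's law in the form $\nabla \cdot \vec{D} = \rho$ and that the current $\vec{j}$ is expressed in the same units as in (4) and (5); it also uses the identity that the divergence of a curl vanishes.

```latex
% Divergence of the amended equation (4); div(curl) = 0 identically.
\begin{align*}
0 = \nabla\cdot(\nabla\wedge\vec{H})
  &= \nabla\cdot\vec{j}
     + \frac{1}{c}\,\frac{\partial}{\partial t}\,(\nabla\cdot\vec{D}) \\
  &= \nabla\cdot\vec{j} + \frac{1}{c}\,\frac{\partial\rho}{\partial t}
     \qquad \text{by (1)},
\end{align*}
% which is equation (5). Without the extra term the same step gives
% 0 = \nabla\cdot\vec{j}, which conflicts with (5) whenever
% \partial\rho/\partial t \neq 0.
```

On this reading the added term is exactly what is needed for (5) to follow from the field equations instead of contradicting them.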

There is a second way in which mathematics can play a fundamental 
role in physical discovery. The usual method in theoretical physics is to 
1 In his [1913] Mach wrote: ‘The sine form recommends itself on account of its simplicity,
and the simplicity of the mechanical hypotheses which suffice for its explanation seemed
to Fresnel to warrant such an assumption’ (p. 216).

2 Peierls admits that he is giving a reconstruction for which he has no historical proof.
Nevertheless he is fairly sure that the argument he describes ‘was in fact, explicitly or
implicitly, part of his [i.e. Maxwell’s] reasoning’ (Peierls [1963], p. 31). I am indebted
to Lakatos and Worrall who drew my attention to these examples supporting my thesis.
Also cf. Worrall [1974].
give mathematical expression to some physical hypothesis and then to
use logico-mathematical techniques in order to draw consequences from 
the hypothesis. In doing so the physicist may have recourse to a number 
of mathematical operations; these operations are sometimes in the nature 
of ‘tricks’ or ‘gimmicks’ which may be needed to make the deduction 
possible. Duhem pointed out that it would be foolish to insist on giving 
a physical interpretation to all mathematical quantities and operations 
used in a scientific theory.1 Duhem is evidently right: adding lengths 
corresponds to placing physical rods one after the other; multiplying 
lengths corresponds to the construction of rectangular areas; but multiplying
the time $t$ by $\sqrt{-1}$, although useful in a pragmatic sense, does
not seem susceptible of physical interpretation. However, through trying
to find a realistic interpretation of certain mathematical entities which 
appear at first sight to be devoid of any physical meaning, the scientist 
may be led to a new physical conjecture. We shall see that Lorentz intro- 
duced his famous transformation as a mathematical tool for solving a 
certain differential equation.² Through interpreting the transformation
as representing a physical dilatation of coordinates, Lorentz was led to the 
L.F.C. or rather to a theory about molecular forces, the M.F.H., from 
which the L.F.C. follows. Similarly Dirac proposed a relativistic equation 
which was found to possess negative energy solutions. Prima facie such 
solutions cannot be physically interpreted. Through insisting on inter- 
preting the negative solutions, Dirac predicted the existence of the 
positron: the absence of an electron of charge —e and energy —E was 
interpreted as the presence of a positron, that is, of a particle of charge 
+e and energy +E. 

This dual heuristic role of mathematics will be apparent in the development
of Einstein’s programme³; Lorentz’s programme, to which we now
return, provides a fine example of the second role of mathematics. 


(b) An Important Illustration: the First Version of Lorentz’s Theory of 
Corresponding States Arises out of the Realistic Interpretation of the Lorentz 
Transformation whose Origins were Purely Mathematical. 


In Lorentz’s [1892a] there is no mention of the ‘crucial’ experiment first
performed by Michelson in 1881, then repeated by Michelson and Morley 
in 1887. This should not surprise us once we have realised that the Lorentz 
transformation was originally used as a mathematical device of which 
Lorentz gave no physical interpretation, in much the same way as we 


1 Duhem [1906], Part 2, Chapter 1. 2 Cf. below, p. 112. 
3 Cf. below, sections 2 and 3. 
might use the expression ict without attaching any physical meaning to the
multiplication of the distance ct by $\sqrt{-1}$.

Lorentz assumed the existence of the ether and of small particles, the 
electrons, which possess both material mass and electric charge. The 
electrons and their motion through the medium generate the field. Using 
Maxwell’s equations for a frame fixed in the ether, one should be able to 
determine the electromagnetic field from the charge, position and state of 
motion of the electrons; that is from the electrical density and velocity 
distributions. Lorentz was thus led to write down the differential equation
D: $[c^2\nabla^2 - \partial^2/\partial t^2] f = G$ ($f$ is the unknown and $G$ is a known function of
$x, y, z, t$), whose solution constituted a purely mathematical problem. This
solution is valid only in a coordinate system which remains at rest in the 
medium. Since the Earth is presumably moving through the ether and 
since we carry out our measurements relatively to the Earth, Lorentz was 
led to consider the field equations in a frame of reference attached to a 
moving body. He thus obtained relations which are more complicated 
than Maxwell’s equations. Again the problem of computing the field 
quantities, given the charge and velocity distributions, forced itself on 
him. The differential equation D*: $[c^2\nabla^2 - (\partial/\partial t - v\,\partial/\partial x)^2] f = G$, which
is the mathematical formulation of this problem, is more complicated
than D. The Lorentz transformation was designed to reduce D* to the
form of D, so that any solution of D automatically yields a solution of D*.
The transformation equations are as follows:

$$x' = \beta x = \beta(x_1 - vt_1), \qquad y' = y = y_1, \qquad z' = z = z_1,$$

$$t' = t - v\beta^2 x/c^2 = \beta^2(t_1 - vx_1/c^2);$$

where $\beta = (1 - v^2/c^2)^{-1/2}$; $x, y, z, t$ are the Galilean coordinates in the
moving frame. These equations, which carry the operator $[c^2\nabla_1^2 - \partial^2/\partial t_1^2]$¹
into $[c^2\nabla'^2 - \beta^2\,\partial^2/\partial t'^2]$,² constitute the classical Lorentz transformation to
within the extra factor $\beta$ in the expression of $t'$. The origins of the Lorentz
transformation were thus strictly mathematical, and had nothing to do 
with the Michelson-Morley experiment. 
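
The statement about the operator can be verified with the chain rule. The following is a sketch under two assumptions not spelled out in the passage: that $x_1, t_1$ denote the coordinates of the frame at rest in the ether, and that only the $x$ and $t$ derivatives need transforming, since $y' = y_1$ and $z' = z_1$.

```latex
% From x' = \beta (x_1 - v t_1) and t' = \beta^2 (t_1 - v x_1 / c^2):
\begin{align*}
\frac{\partial}{\partial x_1}
  &= \beta\,\frac{\partial}{\partial x'}
     - \frac{\beta^2 v}{c^2}\,\frac{\partial}{\partial t'}, &
\frac{\partial}{\partial t_1}
  &= -\beta v\,\frac{\partial}{\partial x'}
     + \beta^2\,\frac{\partial}{\partial t'} .
\end{align*}
% Squaring both operators, the mixed terms (each equal to
% -2 \beta^3 v \, \partial^2 / \partial x' \partial t') cancel in the difference:
\begin{align*}
c^2\,\frac{\partial^2}{\partial x_1^2} - \frac{\partial^2}{\partial t_1^2}
  &= \beta^2 (c^2 - v^2)\,\frac{\partial^2}{\partial x'^2}
     - \beta^4 \Bigl(1 - \frac{v^2}{c^2}\Bigr)\frac{\partial^2}{\partial t'^2} \\
  &= c^2\,\frac{\partial^2}{\partial x'^2}
     - \beta^2\,\frac{\partial^2}{\partial t'^2},
\end{align*}
% using \beta^2 (1 - v^2/c^2) = 1.
```

A further rescaling of the time, $t' = \beta t''$, would turn the last term into $\partial^2/\partial t''^2$ and so give full invariance of the operator.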

However, soon after writing his [1892a], Lorentz realised that the 
transformation equations lent themselves to an interpretation which he 
immediately set out in his [1892b] and which he then expounded in greater
detail in his [1895]. 

1 $\nabla_1 = (\partial/\partial x_1, \partial/\partial y_1, \partial/\partial z_1)$ and $\nabla' = (\partial/\partial x', \partial/\partial y', \partial/\partial z')$.

2 One may wonder why Lorentz did not put $t' = \beta t''$ so as to obtain the full invariance
of the operator $[c^2\nabla_1^2 - \partial^2/\partial t_1^2]$. It is clear that he was not at this stage interested in
invariance as such but in a means of solving a particular mathematical problem. For an
approach to Relativity Theory based on the invariance of $[c^2\nabla^2 - \partial^2/\partial t^2]$ cf. Stephenson
and Kilmister [1958], Chapter 1.
Let us take a look at the equations:

$$x' = \beta x, \qquad y' = y, \qquad z' = z, \qquad t' = t - v\beta^2 x/c^2$$

where $\beta = (1 - v^2/c^2)^{-1/2}$ and $x, y, z, t$ are the Galilean coordinates in the
moving frame S. $x'$ is simply obtained by multiplying $x$ by the factor $\beta$
which is greater than 1. The variable $t'$ is more difficult to interpret
physically because it involves both the absolute time and the position.
Fortunately, by considering a system of particles at rest in S, i.e. particles
all moving with the same velocity $v$ through the ether, Lorentz was able
to simplify his problem. Now the field depends only on $x, y, z$, or alternatively
on $x', y', z'$; so the ‘local’ time $t'$ can be safely ignored. To the
moving system S, Lorentz made correspond a system S’ at rest in the
medium; S’ is obtained by expanding S by the factor $\beta$ along the x-axis,
while keeping the other two dimensions and the charge unaltered. Con- 
versely, S is a contracted image of S’, so the connection between this 
physical interpretation and the Contraction Hypothesis, which Lorentz 
later inferred, becomes obvious. 

The next step was to calculate the forces acting at corresponding points 
of S and S’. Lorentz found that, if

$$F = (F_1, F_2, F_3) \quad \text{and} \quad F' = (F'_1, F'_2, F'_3)$$

denote the forces per unit charge in S and S’ respectively, then:

$$F = (1, 1/\beta, 1/\beta)\,F', \quad \text{that is,} \quad F_1 = F'_1,\ F_2 = F'_2/\beta,\ F_3 = F'_3/\beta.$$

These relations¹ play a fundamental role in the Theory of Corresponding
States. Since each component of one force is proportional to the corresponding
component of the other, the vanishing of one of the forces entails
that of the other.

It is not surprising that the first step towards interpreting the Lorentz- 
transformation should have been taken in electrostatics, where the time 
variable can be ignored. It is well-known that the time-coordinate and 


1 The relation $F = (1, 1/\beta, 1/\beta)\,F'$ follows from Planck’s equation for the relativistic force
$F = d/dt(m_0 \beta v)$, where the particle is instantaneously at rest in the moving frame.
$F = (1, 1/\beta, 1/\beta)\,F'$ contradicts Einstein’s equation $F = m_0(\beta^3 a_1, \beta^2 a_2, \beta^2 a_3)$, where
$(a_1, a_2, a_3) = a$ = acceleration. This agreement between Lorentz and Planck stems
from the fact that both of them take the Lorentz-force as their paradigm of force.
Today it is Planck’s, and not Einstein’s, equation which is generally accepted. In this
sense Lorentz was ahead of Einstein. (Cf. Einstein [1905] and Planck [1906].) Professor
Clive Kilmister pointed out to me that, in his [1905], Einstein makes a Lorentzian type
of assumption about the properties common to all forces. Einstein writes: ‘... these
results as to the mass are also valid for a ponderable material point, because a ponderable
material point can be made into an electron (in our sense of the word) by the addition of
an electric charge, no matter how small’ (cf. Einstein and others [1923], p. 63). Is this
not Lorentz’s assumption again? For, since one is allowed to have transverse and longitudinal
masses, one might just as well expect different masses for electric and other
forces.
more generally the kinematical aspect of the whole problem caused Lorentz
great difficulties and these were settled only by Poincaré and Einstein in 
1905. 

I have already mentioned that this interpretation of the transformation 
equations was first explained at length in the Versuch of 1895; but Lorentz 
had already used it in 1892 in order to derive the Contraction Hypothesis 
and, as a by-product, to account for the null result of Michelson’s
experiment.¹

Both the transformation equations and the Contraction Hypothesis 
were proposed in 1892. In 1899, after developing a large part of his Theory 
of Corresponding States, Lorentz himself described the starting point of 
his investigations as follows: ‘In the preceding investigations, I have 
assumed that all electrical and optical phenomena in ponderable bodies 
are produced by small charged particles (electrons).’ He admitted that 
in the course of his investigations, ‘Certain mathematical artifices have 
permitted me to arrive, by a concise argument, at conclusions to which, without 
these artifices, I should not have arrived except by considerably lengthier 
developments’ 2 


1.4 Lorentz Derived the L.F.C. from the M.F.H. which is, in all Senses, 
non ad hoc: the Michelson-Morley Experiment Lends Dramatic Support 
to the M.F.H., which Conforms to the Heuristic of the Ether Programme. 


In 1892 Lorentz put forward the M.F.H., to which he was led, as we
have just seen, by the coordinate transformation used in his [1892a]. The
M.F.H. asserts that, in passing from the stationary system S’ to the moving
system S, the molecular forces transform like the electrostatic ones; in
other words the stationary and moving molecular forces are also connected
by the equation: $F = (1, 1/\beta, 1/\beta)\,F'$. I shall now reconstruct Lorentz’s
deduction of the L.F.C. from the M.F.H., using the extra assumption U
that the equilibrium configuration of a system of particles is unique.³

Let O'A' = L’ be the length of the rod in the stationary system S’. 
Let OA = L be the length of the same rod in the moving system S
obtained by imparting to S’ a uniform rectilinear motion with speed v
along O'X’. We leave open the question of whether or not L = L’. Let
O’B’ = $\beta L$; i.e. O’B’ is obtained by expanding the rod OA by the factor
$\beta$, while the charge and mass of corresponding elements remain the same.
Let G be the sum of all the forces, molecular as well as electromagnetic, 
1 Lorentz [1892b].

2 Lorentz [1899], (Collected Papers, 5, p. 139); my translation and my italics.
3 Lorentz used U in 1892, but only in 1904 did he refer to it as an independent hypothesis. 
acting at some point P of OA and let G’ be the force exerted at the corresponding
P’ of O'B'. Since P is in equilibrium, G = 0. The M.F.H.
implies $G = (1, 1/\beta, 1/\beta)\,G'$. Hence G’ also vanishes and P’ is in equilibrium.
Since P’ can be an arbitrary point of O’B’, the whole of the rod
O’B’ must be in equilibrium. The same holds for O'A’, so by the uniqueness
hypothesis U, O’B’ = O'A’; i.e. $\beta L = L'$ or $L = (1/\beta)L'$; O'A’ is
therefore contracted by the factor $1/\beta$.

Applying this result to Michelson’s experiment¹ we find that, if
BD = L, then BE equals not L but $L/\beta$. Hence $t_1 - t_2 = (2L/\beta c)\beta^2 - (2L/c)\beta = 0$;
so no shift of the fringes will occur.
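
The two terms in this difference are the round-trip light times along the two arms of the interferometer. The following sketch reconstructs them from the standard ether-theory calculation; it assumes that BE is the arm lying along the direction of motion and BD the transverse arm (the figure on p. 96 is not reproduced here), and the symbols $\ell_\parallel$, $\ell_\perp$, $t_\parallel$, $t_\perp$ are introduced only for this illustration.

```latex
% Round-trip times in the ether frame for an apparatus moving with speed v.
% \ell_\parallel : length of the arm along the motion (assumed to be BE)
% \ell_\perp     : length of the transverse arm (assumed to be BD)
\begin{align*}
t_\parallel &= \frac{\ell_\parallel}{c - v} + \frac{\ell_\parallel}{c + v}
             = \frac{2\,\ell_\parallel\, c}{c^2 - v^2}
             = \frac{2\,\ell_\parallel}{c}\,\beta^2, \\
t_\perp     &= \frac{2\,\ell_\perp}{\sqrt{c^2 - v^2}}
             = \frac{2\,\ell_\perp}{c}\,\beta .
\end{align*}
% Without contraction (\ell_\parallel = \ell_\perp = L) the times differ:
%   t_\parallel - t_\perp = (2L/c)\,\beta(\beta - 1) \neq 0 .
% With the L.F.C. (\ell_\parallel = L/\beta, \ell_\perp = L) they coincide:
%   t_\parallel - t_\perp = (2L/\beta c)\,\beta^2 - (2L/c)\,\beta = 0 .
```

The contraction by $1/\beta$ is thus exactly the amount needed to cancel the difference, to all orders in $v/c$.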

Thus we see that Michelson’s experiment did not refute the conjunction 
of Newton’s Laws and Maxwell’s equations. The central feature of this 
development from the point of view of my approach is that the M.F.H. 
did not result from a consideration of the experimental result. Admittedly 
Lorentz knew of Michelson and Morley’s result from 1887 onwards and 
confessed that it had been worrying him for some time²; but only in 1892,
after finding his mathematical transformation equations, did he think of 
putting forward the M.F.H. In 1887 he might have simply postulated 
that one dimension of a body contracts in the direction of its motion 
through the ether, but he would have considered such a simplistic
contraction hypothesis unacceptable.³ Thus, since the discovery of the M.F.H.
was independent of the Michelson-Morley experiment, whose null-result 
the M.F.H. implies, the experiment strongly supports the hypothesis. 
The Michelson result is, according to my amended definition of novelty⁴,
1 Cf. above, p. 96. 

2 ‘The experiment has been puzzling me for some time’ (Lorentz [1892a] in Collected
Papers, 4, p. 221).

3 In his [1969] Schaffner pointed out that the L.F.C. is not as simple-minded a hypothesis
as it is generally taken to be. Nonetheless he still regards the L.F.C. as ad hoc (p. 500).

4 Cf. above, p. 103.





116 Elke Zahar 


a novel or unexpected prediction from the M.F.H. This hypothesis is
consequently non ad hoc₂.

Did the M.F.H. fit badly into Lorentz’s theoretical system? Put more 
objectively, was the M.F.H. ad hoc₃? Lorentz explained why it was not:
‘Surprising as this hypothesis [i.e. the L.F.C.] may appear at first sight, yet we
shall have to admit that it is by no means far-fetched, as soon as we assume 
that molecular forces are also transmitted through the ether, like the electric 
and magnetic forces of which we are able at the present time to make the assump- 
tion definitely. If they are so transmitted, the translation will very probably 
affect the action between two molecules or atoms, in a manner resembling 
the attraction or repulsion between charged particles.’ 1 


The M.F.H. is therefore non ad hoc₃ within the ether programme, whose
heuristic requires that physical phenomena be explained in terms of 
contiguous actions through the medium. Molecular forces determining 
the shape of a given body are transmitted by the same medium as the 
electromagnetic field; since both types of force are states of the same sub- 
stratum, why should they not behave and transform in the same way? 


1.5 The Progress of Lorentz’s Programme after 1892. 
(a) The Final Version of the Theory of Corresponding States. 

The last version of Lorentz’s Theory of Corresponding States appeared 
in his [1904] which had the revealing title Electromagnetic Phenomena in a 
System Moving with any Velocity less than that of Light. 

The philosophical significance of the Theory of Corresponding States 
is that it could, as Poincaré showed, easily be turned into a theory observa- 
tionally equivalent to Special Relativity. Thus, in order to compare the 
merits of the two rival theories anno 1905, non-empirical criteria will 
have to be invoked. I shall suggest criteria which take into account the 
heuristic power of competing research programmes.³

Let me briefly outline the task⁴ Lorentz set himself after 1895. This
will clarify the philosophical problem mentioned above by showing in 
what way the Theory of Corresponding States can be regarded as equiva- 
lent to Special Relativity. 

Maxwell’s equations, we recall, hold in a frame of reference fixed in the 
ether. Consider a moving system S and an observer O carried along by
the motion of S through the ether. If the instruments of O were unaffected 
by this motion, then the measured lengths, time intervals and field inten- 
sities would bear to one another relations more complicated than Maxwell’s 
1 Einstein and others [1923], p. 5. 

2 Cf. Poincaré [1905] and [1906]; also cf. Ehrenfest [1915]. 


3 Cf. sections 2 and 3.
4 I have used the word ‘task’ on purpose, because Lorentz did not fully achieve his aims.
equations: in particular the measured velocity of light in S would not be
the same in all directions and the observer would thus realise that he was
moving relatively to the medium. The Theory of Corresponding States
asserts that the instruments are distorted in such a way that the measured
quantities (i.e. t’, x’, y’, z’, E’, H’, ρ’) do satisfy Maxwell’s equations;
hence the measured velocity of light is constant in S. The observer will
thus imagine himself to be within a system S’ at rest in the ether, a
system in which the real coordinates are the quantities t’, x’, y’, z’
measured by the distorted instruments. The ‘fictitious’ system S’ is
called the state corresponding to S.¹

Put in more technical language, Lorentz’s project for a Theory of 
Corresponding States consisted of the following stages: 

(1) Consider a frame S₀ at rest and a frame S moving at constant velocity
$V = (v, 0, 0)$ through the ether. In S: $\bar{x} = x - vt$, $\bar{y} = y$, $\bar{z} = z$, $\bar{t} = t$
(Galilean transformation). The components $(D_1, D_2, D_3)$ and $(H_1, H_2, H_3)$
of the electric and magnetic fields are the same in S₀ and S. However, in
S the relations connecting D, H, $\bar{x}, \bar{y}, \bar{z}$, are more complicated than
Maxwell’s equations, which only hold in S₀.

(2) The first problem is to determine in the frame S ‘effective’ variables
t’, x’, y’, z’ and D’, H’, ρ’ in such a way that with respect to t’, x’, y’, z’,
the field quantities D’, H’, and ρ’ satisfy Maxwell’s equations.

(3) Construct a system S’ fixed in the ether and in which the real
variables are the ‘effective’ variables of S. In other words, using Einstein’s
terminology, we consider the following correspondences between events:

$$\underset{\text{in } S_0}{(t, x, y, z)} \;\to\; \underset{\text{in } S}{(\bar{t}, \bar{x}, \bar{y}, \bar{z})} \;\to\; \underset{\text{in } S'}{(t', x', y', z')}$$

To the moving system S corresponds the immobile system S’. Since ρ’,
D’ and H’ satisfy Maxwell’s equations relatively to t’, x’, y’, z’, it follows
that D’ and H’ are respectively the ‘real’ electric and the ‘real’ magnetic
field in the ‘fictitious’ system S’.

(4) Examine hypotheses (such as the M.F.H. and U) which imply that,
if the system S’ is set in motion with constant velocity v, it will rearrange
itself so as to produce the system S. One object of the exercise is now to
show that the results of certain types of experiment, or possibly of any 
experiment whatever, are not affected by the motion of the earth through 


1 This informal description of Lorentz’s theory is not to be taken as literally presupposing
the presence of a conscious observer. The ‘observer’ is introduced for purposes of
exposition and could well be replaced by a set of measuring instruments.
the ether. This is generally achieved by showing that, if certain quantities
P, F, . . . vanish in S’, then the corresponding quantities will also vanish in S.


(b) Lorentz’s Failure to Establish the Full Covariance of Maxwell’s Equations.

Neither in his [1904], nor in his [1909], did Lorentz completely achieve 
all these aims, except in the case of electrostatics. In all other cases he 
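neglected second-order quantities.¹ Further, and relatedly, a great difficulty
was posed by the problem of simultaneity: two events simultaneous in S
may nonetheless occur in different ‘local times’ (‘effective times’). Let
me clarify these remarks. In his [1904] Lorentz writes Maxwell’s equations
as follows:

$$(\mathrm{i})\ \nabla\cdot\vec{D} = \rho \qquad\qquad (\mathrm{iii})\ \nabla\wedge\vec{H} = \frac{1}{c}\left(\frac{\partial\vec{D}}{\partial t} + \rho\vec{v}\right)$$

$$(\mathrm{ii})\ \nabla\cdot\vec{H} = 0 \qquad\qquad (\mathrm{iv})\ \nabla\wedge\vec{D} = -\frac{1}{c}\frac{\partial\vec{H}}{\partial t}$$

where $\rho$ is the density and $\vec{v} = (v_1, v_2, v_3)$ is the velocity vector at time $t$
at the point $(x, y, z)$.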

In a frame of reference moving with velocity v = (v, 0, 0) through the
ether, the ‘effective’ coordinates are given by the Lorentz transformation
equations as we know them today.2


(1) $x' = \beta(x - vt)$; $y' = y$; $z' = z$; $t' = \beta(t - vx/c^2)$


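The vectors $H' = (H'_1, H'_2, H'_3)$ and $D' = (D'_1, D'_2, D'_3)$ are defined as
follows:


(2) $D'_1 = D_1$; $D'_2 = \beta(D_2 - \frac{v}{c}H_3)$; $D'_3 = \beta(D_3 + \frac{v}{c}H_2)$

$H'_1 = H_1$; $H'_2 = \beta(H_2 + \frac{v}{c}D_3)$; $H'_3 = \beta(H_3 - \frac{v}{c}D_2)$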


Let us note in passing that equations (2) are identical with Einstein’s 
transformation equations for the electric and magnetic fields. 
Under the transformations (1) and (2), Maxwell’s equations become: 


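(i′) $\nabla' \cdot D' = (1 - vu'_1/c^2)\rho'$    (iii′) $\nabla' \wedge H' = \frac{1}{c}\left(\frac{\partial D'}{\partial t'} + \rho' u'\right)$

(ii′) $\nabla' \cdot H' = 0$    (iv′) $\nabla' \wedge D' = -\frac{1}{c}\frac{\partial H'}{\partial t'}$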


1 Cf. Einstein and others [1923], pp. 14-21. Also Lorentz [1909], p. 203. 
2 Lorentz interposes the Galilean coordinates: $x_r = x - vt$, $y_r = y$, $z_r = z$, $t_r = t$ between
$(t, x, y, z)$ and $(t', x', y', z')$.
where: (3) $u' = (\beta^2(u_1 - v), \beta u_2, \beta u_3)$ and $\rho' = \rho/\beta$.

Except for the interpretation of $\rho'$ and $u'$, equations (i′)–(iv′) are
exactly those obtained by Einstein in his [1905].
Although Lorentz’s equation $\nabla' \cdot D' = (1 - vu'_1/c^2)\rho'$ is correct, the
transform of the charge density is not $\rho'$ but $\sigma' = (1 - vu'_1/c^2)\rho'$.
In fact: $\sigma' = (1/\beta)[1 - vu'_1/c^2]\rho = (1/\beta)[1 - (v\beta^2/c^2)(u_1 - v)]\rho = \beta(1 - vu_1/c^2)\rho$.1

Here Lorentz seems to have made an easily corrigible mistake. Of course,
in terms of the Galilean coordinates $x_r$, $y_r$, $z_r$, we have: $x' = \beta x_r$, $y' = y_r$,
$z' = z_r$, from which it appears to follow that $dx'\,dy'\,dz' = \beta\,dx_r\,dy_r\,dz_r$
and so $\rho'$ = transformed density = $\rho/\beta$. But so to interpret $\rho'$ is to forget
that in general $\rho$ depends on the time, so that $\rho' = \rho/\beta$ holds only for an
electrostatic system. In order to determine the density $\sigma'$ in $S'$ at the
space-time point $(t', x', y', z')$ we consider an infinitesimal volume $dV'$
enclosing the point $(x', y', z')$ and count the charged particles lying within
$dV'$ all at the same time $t'$. We therefore have to take into account a number
of events which, while being simultaneous in $S'$, are not necessarily
simultaneous in S. Lorentz’s mistake in the transformation equation of $\rho$
is therefore deeply significant. It stems from the difficulties presented by the
physical interpretation of local time. In 1904 Lorentz had not yet realised
that the local or effective time $t'$ was in fact the time measured by a moving
clock synchronised in accordance with Einstein’s convention. Yet the
equations obtained by Lorentz in 1904 are so similar to Maxwell’s that
one wonders why Lorentz did not simply postulate:

$\sigma' = (1 - vu'_1/c^2)\rho'$ = transformed density

and thus obtain the full invariance of Maxwell’s equations. However,
had he taken this step, he would still have had to face the problem posed
by equation (iii′), which now becomes:


$\nabla' \wedge H' = \frac{1}{c}\left[\frac{\partial D'}{\partial t'} + \frac{\sigma' u'}{1 - vu'_1/c^2}\right]$


He would have had to interpret $u'/(1 - vu'_1/c^2)$ as an effective velocity,
and for this he needed to have developed a general kinematical framework.
This is where Einstein’s approach, which starts with general kinematical
considerations, proved far superior to Lorentz’s. Whereas Einstein sorts out
his kinematics before imposing the condition of Lorentz-covariance on all

1 This is precisely the value given by Einstein to the transformed density.
physical laws and in particular on electromagnetic theory, Lorentz painfully
struggles to arrive at a new kinematics via electromagnetism.


(c) Poincaré’s Contribution and the ‘Observational Equivalence’ of Special
Relativity and the Theory of Corresponding States.

Using Lorentzian methods, Poincaré established the full covariance of
Maxwell’s equations by determining the correct transformation rule for
the density as follows.2 Consider an electron whose charge is e, moving
with velocity $(\xi, \eta, \zeta)$ in the ether. The equation of a sphere of radius r
centred on the electron is:

$(x - \xi t)^2 + (y - \eta t)^2 + (z - \zeta t)^2 = r^2$

Using the effective coordinates, we obtain:

$x = \beta(x' + vt')$, $y = y'$, $z = z'$, $t = \beta(t' + (v/c^2)x')$

Hence:

$[\beta(x' + vt') - \xi\beta(t' + vx'/c^2)]^2 + [y' - \eta\beta(t' + vx'/c^2)]^2 + [z' - \zeta\beta(t' + vx'/c^2)]^2 = r^2$.

Consider this equation for some given time $t'_0$, say, $t'_0 = 0$:

$\beta^2(1 - \xi v/c^2)^2 x'^2 + [y' - \eta\beta vx'/c^2]^2 + [z' - \zeta\beta vx'/c^2]^2 = r^2$

This represents an ellipsoid whose volume is:

$\frac{4}{3}\pi r^3/\beta(1 - \xi v/c^2)$

By the law of conservation of charge:

$e = \rho \cdot \frac{4}{3}\pi r^3 = \rho' \cdot \frac{4}{3}\pi r^3/\beta(1 - \xi v/c^2)$

Hence:

$\rho' = \rho\beta(1 - \xi v/c^2)$

This is exactly Einstein’s transformation equation for $\rho$.
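
The volume argument above can be checked numerically. The following is only an illustrative sketch, not part of Poincaré's derivation: the function name `transformed_density` is hypothetical, units with c = 1 are assumed by default, and the code relies on the observation that at t′ = 0 the map from (x′, y′, z′) to (x − ξt, y − ηt, z − ζt) is linear and lower triangular, so that volumes scale by the product of its diagonal entries.

```python
import math

def transformed_density(rho, xi, eta, zeta, v, c=1.0, r=1.0):
    """Density in effective coordinates of a charged sphere moving at (xi, eta, zeta)."""
    beta = 1.0 / math.sqrt(1.0 - v * v / (c * c))  # the text's beta
    # At t' = 0 the sphere reads X^2 + Y^2 + Z^2 = r^2 with
    #   X = beta * (1 - xi*v/c^2) * x'
    #   Y = y' - eta  * beta * v / c^2 * x'
    #   Z = z' - zeta * beta * v / c^2 * x'
    # The matrix of (x', y', z') -> (X, Y, Z) is lower triangular, so its
    # determinant is the product of the diagonal entries; eta and zeta drop out.
    jacobian = beta * (1.0 - xi * v / (c * c)) * 1.0 * 1.0
    sphere_volume = 4.0 / 3.0 * math.pi * r ** 3
    ellipsoid_volume = sphere_volume / abs(jacobian)
    charge = rho * sphere_volume          # conservation of charge
    return charge / ellipsoid_volume
```

For an electron at rest in the moving frame (ξ = v, η = ζ = 0) the function returns ρ/β, which is the electrostatic value discussed earlier in connection with Lorentz's mistake.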

Poincaré considers the electron as a compressible body actually flattened 
by pressure exerted on its outer surface. The contraction along the 
direction of motion is not brought about, as Einstein was later to maintain, 
by the peculiar structure of space-time, but by a real physical force. This 
is why I said that Poincaré used Lorentzian methods.3

The connection between Special Relativity and the Theory of Corresponding
States, as amended by Poincaré, can be described as follows.4
1 Cf. below, sections 2 and 3. 

2 Poincaré [1905] and [1906]. Also cf. Kilmister [1970], p. 145. 

3 But despite this, one hesitates to count Poincaré among classical physicists such as
Maxwell and Lorentz. Poincaré was the first scientist to recognise the group character
of the transformation equations and probably also the first clearly to enunciate a physical
principle of relativity. (Einstein is supposed to have carefully read ‘Science and
Hypothesis’ before 1905.) Whatever the case may be, Poincaré showed, using Lorentz’s
own approach, how the Theory of Corresponding States could be made observationally
equivalent to Special Relativity.
4 Cf. above, pp. 116-118. 
For Einstein the ‘effective’ variables $t'$, $x'$, $y'$, $z'$ are the ‘real’ coordinates
in S. The equations $x' = \beta(x - vt)$, $y' = y$, $z' = z$, $t' = \beta(t - vx/c^2)$,
$\beta = (1 - v^2/c^2)^{-1/2}$, which were discovered by Lorentz, were used in
Einstein’s [1905] to relate the coordinates in the two inertial frames $S_0$
and S. Einstein completely abolished the system $S'$.1 Lorentz realised
after 1905 that his ‘effective’ variables are in fact the measured lengths,
time intervals and field intensities in the moving frame S. Because his
rods are shortened, an experimenter in S obtains as a measure of the
x-coordinate not $x_r = x - vt$ but $x' = \beta x_r = \beta(x - vt)$; moreover his clocks
are retarded by the factor $(1/\beta)$; if he adopts Einstein’s convention for
clock-synchronisation, he obtains as the measure of time in S not t but
$t' = \beta(t - vx/c^2)$.

It can be easily verified that $[(dx/dt)^2 + (dy/dt)^2 + (dz/dt)^2 = c^2]$ holds
if and only if $[(dx'/dt')^2 + (dy'/dt')^2 + (dz'/dt')^2 = c^2]$ holds. Thus, in
Lorentz’s system, the measured velocity of light is the same in all inertial 
frames. 
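
The verification mentioned above can be sketched numerically. This is an illustration under stated assumptions rather than a proof: the helper names `lorentz` and `measured_speed` are hypothetical, units with c = 1 are used by default, and the factor called β here is the one the text writes as (1 − v²/c²) to the power −1/2. The code sends a signal at speed c in several directions in the ether frame and computes its speed in the effective coordinates.

```python
import math

def lorentz(t, x, y, z, v, c=1.0):
    """Map ether coordinates (t, x, y, z) to effective coordinates (t', x', y', z')."""
    beta = 1.0 / math.sqrt(1.0 - v * v / (c * c))
    return (beta * (t - v * x / (c * c)), beta * (x - v * t), y, z)

def measured_speed(direction, v, c=1.0):
    """Speed, in effective coordinates, of a signal moving at speed c along `direction`."""
    norm = math.sqrt(sum(d * d for d in direction))
    ux, uy, uz = (c * d / norm for d in direction)
    # Two events on the signal's path: emission at the origin and its position one time unit later.
    t0, x0, y0, z0 = lorentz(0.0, 0.0, 0.0, 0.0, v, c)
    t1, x1, y1, z1 = lorentz(1.0, ux, uy, uz, v, c)
    distance = math.sqrt((x1 - x0) ** 2 + (y1 - y0) ** 2 + (z1 - z0) ** 2)
    return distance / (t1 - t0)

directions = [(1, 0, 0), (0, 1, 0), (1, 2, -2), (-1, 0, 0)]
speeds = [measured_speed(d, v=0.6) for d in directions]
```

Every entry of `speeds` agrees with c up to rounding error, whichever direction is chosen.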

Einstein differs from Lorentz in that he regards the ‘effective’ variables
in S as the real ones and totally abolishes the Galilean transformation, i.e.
the mapping $(t, x, y, z) \to (t, x_r, y_r, z_r)$. The Theory of Corresponding
States is ‘observationally equivalent’ to Special Relativity because experi- 
mental results involve only measured, that is, ‘effective’, quantities. Since 
the latter satisfy Maxwell’s equations, we are unable, whether we adopt 
Lorentz’s or Einstein’s theory, to decide on empirical grounds whether 
our frame of reference is in motion or at rest in the ‘ether’. 


1.6 The Rationality of Lorentz’s Pursuing his own Programme after 1905. 
The subsequent success of Relativity Theory can easily give one the 
impression that Lorentz was wrong-headed, not to say crankish, in not 
immediately accepting Einstein’s ideas, that he was too slow to see the 
light. But Lorentz was the champion of the classical electromagnetic 
programme started by Faraday and articulated by Maxwell. Needless to 
say, the mechanics of this programme was borrowed from Newton. Its 
hard core consisted of Newton’s three laws of motion and Maxwell’s 
equations. Newton had a classical Principle of Relativity, according to 
which the laws of mechanics are the same in any two frames moving 
with uniform velocity relatively to each other. However, Newton assumed 
the existence of Absolute Space, that is of a preferred frame, although the 
latter could not be determined on the basis of mechanics alone. The 
Absolute Space Hypothesis was an idle component of Newtonian Theory 
in the sense that any inertial frame could be considered as immobile in 


1 More precisely, Einstein identified S’ with S. 


Absolute Space. With the wave theory of light, which seemed to presuppose
a medium of transmission, arose the possibility of turning the
Absolute Space Hypothesis into an empirically testable theory. Lorentz’s 
ontology consisted of an infinite immobile ether in which charge was 
continuously distributed. The electrons were spherical regions of the 
ether where the charge and possibly the mass densities differ from zero. 
The total amount of charge remains constant, but the movement of the 
electrons creates a field which travels in free space (the ether) at a constant 
finite speed c. Lorentz tentatively assumed that the electron possesses no 
material mass; the electron engenders a field which acts back on the 
source and decelerates its motion; this capacity for resisting change of 
motion is the electromagnetic mass which varies with the speed and 
accounts for the total inertia of the particle. Thus we see how fundamental 
is the role played by charge, Absolute Time and Absolute Space (i.e. the 
ether) in Lorentz’s approach. They are the ultimate constituents of a
physical world closely resembling Newton’s and which we may call the
classical world.

Thus, for Lorentz to have switched to the Relativity Programme would 
have involved a major change in his metaphysical outlook. But why should 
he have made the change? If Einstein’s theory had immediately thrown 
up new facts which Lorentz’s system either could not account for, or 
could only account for in an ad hoc way, then Lorentz’s adherence to the 
classical ontology could not be characterised as rational. But, on the 
contrary, Lorentz’s own approach, based on his classical ontology, enabled 
him to make theoretical and empirical progress—often in advance of 
Einstein. For example, Lorentz explained Michelson’s result in a non 
ad hoc way; he was first to discover the transformation laws for the electro- 
magnetic field; he described the way in which the inertia of the electron 
depends both on its energy and on its velocity; and he explained the in- 
variance of c. Thus Lorentz’s continued adherence to his own programme 
after 1905 was completely rational. 

The ontology presupposed by Einstein’s theory is radically different 
from the classical one. Some positivists, among them Bridgman,2 claim
that Einstein had no ‘metaphysical’ commitments, his theory being a
mere description of actual physical operations. But this is an illusion.
In his [1905], Einstein implicitly posits a domain of events, each of which
can be referred to by coordinates (t, x, y, z) in any one of infinitely many
equivalent inertial frames. Events are therefore the constituents of the 


1 I do not mean that the Absolute Space Hypothesis was to become testable in isolation,
but that its addition to existing theories would increase the number of testable consequences.
2 Bridgman [1936], chapter II.
Einsteinian universe. At the beginning it was very difficult for Lorentz
to acquiesce in this radical change of world view, especially since his 
theory and Einstein’s explain the same facts. Because of the ‘observational 
equivalence’ of the two theories, one might be tempted to think that 
‘metaphysics’ is irrelevant to this whole issue. We shall however see that 
ontologies, which might be unimportant in the case of individual theories, 
can play an essential heuristic role in the development of research programmes
by providing different regulative principles.1 This can be
properly appreciated only after examining Einstein’s programme in 
greater detail. 

What I have established so far is that one cannot explain the success of 
Einstein’s Special Relativity Theory in terms of the demerits of Lorentz’s 
rival theory. Lorentz’s programme was non ad hoc in all senses of the 
term. The adjustments to the theory in the 1890’s were not made in the 
light of Michelson’s result and thus were not ad hoc relative to it. The 
adjustments were both theoretically and empirically progressive and they 
were made in conformity to the heuristic of the classical programme. Thus 
if the eventual acceptance by the scientific community of Einstein’s theory 
in preference to Lorentz’s was rational (i.e. if there are acceptable general 
criteria according to which Einstein’s theory was objectively better than 
Lorentz’s), that rationality must lie in the extra merits of Einstein’s theory. 
I now turn to the Einsteinian programme and a consideration of its merits. 
Let me say that I shall argue that the acceptance of Einstein’s programme 
was rational, although, given that Lorentz’s and Einstein’s theories were 
anno 1905 ‘observationally equivalent’, my claim may well appear doubtful 
at this stage. 

(To be continued.) 
The London School of Economics 


1 It should be noted that a ‘metaphysical principle’, as I use the term, is now an integral
part of a scientific research programme and relates either to its hard core or to its
heuristic (or both). Ontological propositions are, in turn, part of this scientific
metaphysics. All these components, in the last analysis, are appraised in terms of the
overall progress or degeneration of the whole programme.


Brit. J. Phil. Sci. 24 (1973), 125-152 Printed in Great Britain 125 


The Logic of Subjective Probability 


by BRIAN ELLIS 


1 Introduction.
2 The Logical Correspondence Principle.
3 The Dutch Book Argument and its Limitations.
4 Acceptability Conditions.
5 Conditionals.
6 Notes on the Systems PC* and PRC*.
7 Conclusion and Outlook.


1 INTRODUCTION


There is a sense in which our logics of truth and of certainty should 
coincide. They should satisfy what I call the logical correspondence principle. 
By a logic of truth I mean a system of logic such as the propositional 
calculus of Whitehead and Russell (PC) the theorems of which are truth
tautologies (i.e. propositional formulae all instances of which are propositions
that are true in all possible worlds). By a logic of certainty I mean a
probability system in which the range of possible probability values is
restricted to 1 and 0. Now I say that a probability system A and a system of
propositional logic B satisfy the logical correspondence principle if
(a) $P(\alpha) = 1$ is a theorem of A iff $\alpha$ is a theorem of B, and (b) $P(\alpha) = 1$,
$P(\beta) = 1, \ldots \to P(\gamma) = 1$ is valid in A iff $\alpha, \beta, \ldots \to \gamma$ is valid in B.

It will be argued here that PC and the Probability Calculus of Kolmo- 
gorov (PRC) do not satisfy the logical correspondence principle. PC 
corresponds only to a fragment of PRC—in particular to the ‘absolute’ 
fragment (PRC,) that we can derive from PRC by deleting the definition 
of conditional probability and all consequent theorems. In other words, 
PC is a logic of truth that corresponds to a probability system which lacks 
conditional probabilities. It can be shown, however, that PRC, is not a 
satisfactory system for assessing the validity or otherwise of arguments 
involving probability claims concerning conditional propositions. For this 
purpose we need a probability calculus that is at least as strong as PRC. 
Therefore, if our systems of logic and probability are to be brought into 
line by the logical correspondence principle, we must find a probability 
calculus PRC* that is at least as strong as PRC and a propositional 
calculus PC* stronger than PC which corresponds to the logic of certainty
derivable from PRC*. 

The aim of this paper is to justify these various claims and to consider 
some ways in which such systems might be derived. If PRC were adequate 
as it stood as a logic of probability claims there would be no particular 
difficulty. But there is good reason for thinking that PRC is only a fragment 
of the general probability system that we require. For the validity or 
otherwise of arguments involving compound conditionals (i.e. propositions
that contain a conditional within the scope of a propositional connective or 
operator), cannot be adequately assessed using only the apparatus of PRC. 
Therefore, the required system PRC* will need to be stronger than 
PRC. 

When I began writing this paper I was hopeful that the Dutch book 
argument that is used by subjectivists to derive PRC could also be used to 
derive the system PRC*. For, if we define the conditional proposition
$p \Rightarrow q$ to be any proposition a simple bet on which necessarily has the same pay-off
conditions as a bet on q made conditionally on p, then the pay-off conditions
for bets on compound conditionals, such as $(p \Rightarrow q) \vee r$, should be
determinable from a knowledge of the pay-off conditions for simple bets
on $p \Rightarrow q$ and r. And once we know the pay-off conditions for a bet on a
given proposition we have what we need for the purposes of the Dutch book 
argument. It turns out, however, that rational fair-betting quotients for 
compound conditionals are not probability measures. Hence the Dutch 
book argument cannot be used to derive the required augmented proba- 
bility calculus. 

The argument which leads to this conclusion is of independent interest. 
For the same argument also demonstrates that in general, rational fair- 
betting quotients for semi-decidable propositions (i.e. propositions that 
could become accepted as true, could become accepted as false, and could 
also just remain undecided) are not probability measures. But since most 
interesting scientific hypotheses and nearly all undecided historical 
propositions are semi-decidable, it follows that rational fair-betting 
quotients for such propositions are not probability measures, and hence 
that the Dutch book argument provides no adequate foundation for using 
Bayes’s theorem to determine the posterior probabilities of scientific or 
historical hypotheses. 

In the end, I am able to offer no solution to the main problem that I 
have raised. We have as yet no adequate logic of subjective probability, 
and with the failure of the Dutch book argument even to apply to most 
probability claims concerning propositions of real scientific or historical
interest, no such system is in sight. 


2 THE LOGICAL CORRESPONDENCE PRINCIPLE 


A logic of certainty is a two-valued probability system in which the range
of possible probability values is restricted to 1 and 0. Adopting a subjectivist
viewpoint, it seems that such a system of logic should yield the
same judgments of validity when it is applied to the analysis of arguments as a
logic of truth. For if we are certain of the premisses of a valid argument,
we ought to be certain of its conclusion, and if we are not certain of the
conclusion of a valid argument, then we ought not to be certain of all its
premisses. In a two-valued probability system with the range of possible
probability values restricted to 1 and 0, if $P(\alpha) \neq 1$, then $P(\alpha) = 0$.
Hence if we assign zero probability to the conclusion of a valid argument, 
then in a logic of certainty we must assign zero probability to at least one 
of its premisses. But likewise, if the premisses of a logically valid argument 
are true, its conclusion must be true, and if its conclusion is false, at least 
one of its premisses is false. Consequently, if there is any divergence 
between our logics of truth and of certainty, then either something is 
wrong with our probability theory or with the way that we have applied 
it to the analysis of arguments, or something is wrong with our logic of 
truth. If the applications that we make of our systems of logic and proba- 
bility theory are both sound, then our logics of truth and of certainty must 
coincide in the sense defined. With the analogy of physics in mind, I call 
this the logical correspondence principle. 

It is not difficult to show that if expressions of the form ‘$\beta/\alpha$’ can be
taken to denote propositional formulae then PC does not coincide in this
sense with PRC. It is true that if $\alpha$ is a theorem of PC then $P(\alpha) = 1$ is
a theorem of PRC. But there are many theorems of PRC of the form
‘$P(\beta/\alpha) = 1$’ that do not correspond to theorems of PC. For example,
$P((q \vee \sim q)/(p \vee \sim p)) = 1$ is a theorem of PRC, but $(q \vee \sim q)/(p \vee \sim p)$
is not a theorem of PC. And it is clear that there are many
valid arguments of the logic of certainty derivable from PRC which have 
no counterparts in PC. So, apparently, our logics of truth and of certainty 
do not coincide in the required sense. 

There would be no violation of the logical correspondence principle if
expressions of the form ‘$\beta/\alpha$’ occurring in probability equations of the form
‘$P(\beta/\alpha) = 1$’ either could not or should not be considered to be expressions
denoting propositional formulae. But in that case, PRC is stronger than it
needs to be to serve as a system of logic. If PC is adequate as it stands as
a logic of truth, then the system PRC, obtained from PRC by deleting the
definition of conditional probability and all consequent theorems should
be adequate as a logic of probability claims, and we should never have any
use for expressions of the form ‘$P(\beta/\alpha) = 1$’, where ‘$\alpha$’ and ‘$\beta$’ denote
propositional formulae.

It is demonstrable, however, that PRC, is not an adequate logic of
probability claims, and that to assess the validity or otherwise of arguments
involving probability claims made in respect of conditionals we must use
the full apparatus of PRC and represent the i-th degree probability claim
concerning the proposition that if p then q by ‘$P(q/p) = i$’.

There are only two plausible ways in which the i-th degree probability
claim concerning the proposition that if p then q might be represented using
the material conditional of PC for ‘if...then...’.

(a) $P(p \supset q) = i$

(b) $p \supset (P(q) = i)$

But both fail because of what may be called the probabilistic paradoxes of
material implication.

The first is unsatisfactory, because it entails that the propositions ‘if p 
then q’ and ‘if ~ p then q’ cannot then both be improbable. For suppose 

(1) $P(p \supset q) < \frac{1}{2}$

(2) $P(\sim p \supset q) < \frac{1}{2}$
Then we have that

(3) $P(p \wedge \sim q) > \frac{1}{2}$

(4) $P(\sim p \wedge \sim q) > \frac{1}{2}$
and hence by addition that

(5) $P(\sim q) > 1$
It is bad enough that we should have to say that these two propositions
cannot both be false, but intolerable that we should have to say that they
cannot both be improbable when we have readily available a way of
representing the improbability claims concerning these two propositions
which will enable us to draw the obviously correct conclusion that $P(q) < \frac{1}{2}$.
From

(1a) $P(q/p) < \frac{1}{2}$

(2a) $P(q/\sim p) < \frac{1}{2}$
we can deduce that

(3a) $P(p \wedge q) < \frac{1}{2}P(p)$

(4a) $P(\sim p \wedge q) < \frac{1}{2}P(\sim p)$
and hence by addition that

(5a) $P(q) < \frac{1}{2}$.
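
The contrast between the two representations can be illustrated with a small computation. The sketch below is only an example under stated assumptions: the joint distribution over p and q is a hypothetical one chosen for illustration, and the helper names `prob` and `cond` are not from the text. It uses the fact that the material conditional p ⊃ q is false only when p is true and q false, so that P(p ⊃ q) + P(~p ⊃ q) = 2 − P(~q), which can never fall below 1.

```python
from fractions import Fraction as F

# Hypothetical joint distribution over the truth values of (p, q).
joint = {
    (True, True): F(1, 10), (True, False): F(4, 10),
    (False, True): F(1, 10), (False, False): F(4, 10),
}

def prob(event):
    """Probability of the set of (p, q) assignments satisfying `event`."""
    return sum(w for (p, q), w in joint.items() if event(p, q))

def cond(event, given):
    """Conditional probability as a ratio of two absolute probabilities."""
    return prob(lambda p, q: event(p, q) and given(p, q)) / prob(given)

material_1 = prob(lambda p, q: (not p) or q)           # P(p > q), material reading
material_2 = prob(lambda p, q: p or q)                 # P(~p > q), material reading
cond_1 = cond(lambda p, q: q, lambda p, q: p)          # P(q/p)
cond_2 = cond(lambda p, q: q, lambda p, q: not p)      # P(q/~p)
p_q = prob(lambda p, q: q)                             # P(q)
```

With these numbers both conditional probabilities equal 1/5 and so does P(q), whereas each material conditional receives probability 3/5: on the material reading the two conditionals cannot both come out improbable.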

The representation (b) avoids this particular paradox, but incurs
another equally serious, because it entails that if it is certain that if p then q
and certain that if ~ p then r, then it should be held to be certain that q
or certain that r. Moreover, we should have to say that two conditionals
with the same consequent cannot differ in probability if their antecedents
both happen to be (although we may not know them to be) true. For, from

(6) $p \supset (P(q) = i)$

(7) $r \supset (P(q) \neq i)$

(8) $p \wedge r$
we can deduce that:

(9) $(P(q) = i) \wedge (P(q) \neq i)$

But no contradiction follows from [$P(q/p) = i$, $P(q/r) \neq i$, $p \wedge r$] and
$P(q/p) = 1$ and $P(r/\sim p) = 1$ do not together entail $(P(q) = 1) \vee (P(r) = 1)$.
It therefore appears that ‘q/p’ is the most satisfactory representation
that we have of the proposition that if p then q, when it occurs
within the context of a probability claim. But since certainty claims are
only limiting cases of probability claims, it would be surprising if ‘$P(q/p) = 1$’
did not turn out to be a better representation of the claim that it is
certain that if p then q than ‘$P(p \supset q) = 1$’. If so, then there is, even in
application, no coincidence in the sense required between our logics of
truth and of certainty. Therefore, the systems PC and PRC do not satisfy
the logical correspondence principle. The systems PC and PRC, do
satisfy it, but PRC, is not an adequate logic of probability claims. We
must therefore look for a probability calculus PRC* at least as strong as
PRC and a propositional calculus PC* stronger than PC both of which
contain a propositional connective having properties similar to the solidus
‘/’ of PRC which can be used in the representation of conditional
propositions.

The system PRC* should in fact be stronger than PRC. For PRC 
contains no rules for handling compound conditional propositions (i.e. 
propositions that contain a conditional within the scope of a propositional 
connective or operator). In the standard logical interpretation of PRC, 
P(q/p) is a ratio of probability measures on the propositions $p \wedge q$ and p,
but P(q/p) is not itself a measure on any proposition. Hence the expressions
‘$P((q/p) \vee r)$’, ‘$P((q/p) \wedge r)$’, ‘$P(\sim(q/p))$’, ‘$P((r/q)/p)$’ are not well formed.
But if ‘/’ is to be interpreted as a propositional connective, then q/p must
be a proposition, and P(q/p) a measure on that proposition. Hence the
system PRC* will have to be stronger than PRC, and will have to be capable
of handling expressions such as these.

It may be objected here that the need for a probability calculus strong 
enough to deal with probability claims concerning compound conditionals 
is more formal than real. For as a matter of fact we seldom do make 
probability claims concerning such propositions, and when we do, the
claims we make can usually be construed as probability claims concerning
propositions that are not compound conditionals. For example, the claim
that it is probable that both q if p and r if p would appear to be just the
claim that it is probable that both q and r if p. Hence, instead of writing
‘$P((q/p) \wedge (r/p)) > \frac{1}{2}$’ to represent this claim, as apparently we should,
we may simply write ‘$P((q \wedge r)/p) > \frac{1}{2}$’. But it is not obvious that such a
translation into the symbolism of PRC is always possible. How, for
example, should we represent the claim that it is probable that q iff p?
And even if a translation into the symbolism of PRC were always possible,
we could not effect the translations if we did not know what propositions
the various compound conditionals were equivalent to. The need for a
probability calculus strong enough for us to be able to express such
equivalences thus appears to be inescapable. If it is correct to represent
the claim that it is probable that both q if p and r if p by ‘$P((q \wedge r)/p) > \frac{1}{2}$’,
then

$P((q \wedge r)/p) = P((q/p) \wedge (r/p))$ should be a theorem of PRC* and

$((q \wedge r)/p) \equiv ((q/p) \wedge (r/p))$ a theorem of PC*.

Let us now consider one way in which it seems the required systems 
PC* and PRC* might be derived. It is well-known that the Dutch book 
argument can be used to derive PRC. For example, if we consider a book 
consisting of a bet on ~ p, a bet on q made conditionally on p, and a bet on
$p \wedge \sim q$, where p and q are decidable propositions (see 2 below), and ask
how the betting quotients should be chosen for such a system of bets if no
Dutch book is to be possible against either punter or bookmaker, then it
can be demonstrated that the chosen betting quotients must satisfy the
relationship
$(1 - Q(\sim p)) \times Q(q/p) = 1 - Q(\sim p) - Q(p \wedge \sim q)$.
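
The relationship (1 − Q(~p)) × Q(q/p) = 1 − Q(~p) − Q(p ∧ ~q) can be illustrated by constructing a losing book when it fails. The sketch below is a hedged illustration, not the author's own derivation: the function names are hypothetical, stakes are measured by expected return, a conditional bet is assumed to be called off with the stake returned when p is false, and a negative stake would mean that the punter takes the bookmaker's side of that bet. The punter's net gains are linear in the three stakes, and the determinant of the resulting 3 × 3 system vanishes exactly when the relationship holds.

```python
from fractions import Fraction as F

def payoff_matrix(a, b, c):
    """Punter's net gain per unit of expected return.

    Rows are the possible outcomes; columns are the bets on ~p (quotient a),
    on q conditionally on p (quotient b) and on p & ~q (quotient c).
    """
    return [
        [1 - a, F(0), -c],   # ~p: first bet won, conditional bet returned, third lost
        [-a, 1 - b, -c],     # p & q: only the conditional bet is won
        [-a, -b, 1 - c],     # p & ~q: only the third bet is won
    ]

def det3(m):
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def sure_loss_stakes(a, b, c):
    """Stakes giving the punter a net gain of -1 in every outcome, or None."""
    m = payoff_matrix(a, b, c)
    d = det3(m)              # equals (1 - a) * (1 - b) - c
    if d == 0:
        return None          # the relationship holds; this construction is unavailable
    stakes = []
    for j in range(3):       # Cramer's rule with right-hand side (-1, -1, -1)
        mj = [[F(-1) if k == j else row[k] for k in range(3)] for row in m]
        stakes.append(det3(mj) / d)
    return stakes

def net_gains(a, b, c, stakes):
    return [sum(x * s for x, s in zip(row, stakes)) for row in payoff_matrix(a, b, c)]

half = F(1, 2)
incoherent = sure_loss_stakes(half, half, half)        # violates the relationship
coherent = sure_loss_stakes(half, F(1, 4), F(3, 8))    # satisfies it
```

With all three quotients set at 1/2 the stakes come out as 2, 4 and 4, and the punter loses one unit whatever happens; with quotients 1/2, 1/4 and 3/8 the relationship is satisfied and no such stakes exist.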

Let us define the conditional proposition $p \Rightarrow q$ to be any proposition a simple bet on
which necessarily has the same pay-off conditions as a bet on q made
conditionally on p. Then the same argument must yield the result that if
no Dutch book is possible the chosen betting quotients for a book on
$\{\sim p, p \Rightarrow q, p \wedge \sim q\}$ must satisfy the relationship

$(1 - Q(\sim p)) \times Q(p \Rightarrow q) = 1 - Q(\sim p) - Q(p \wedge \sim q)$

Moreover, once we know the pay-off conditions for bets on conditional
propositions, we can easily determine the pay-off conditions for bets on
compound conditionals. For example, we might suppose that a bet on
$(p \Rightarrow q) \wedge (p \Rightarrow r)$ would be paid only if bets on both $p \Rightarrow q$ and $p \Rightarrow r$
would be paid, lost if bets on either $p \Rightarrow q$ or $p \Rightarrow r$ would be lost, and
returned if bets on either $p \Rightarrow q$ or $p \Rightarrow r$ would be returned. But once we
know the pay-off conditions for bets on compound conditionals we know
all we need for the purposes of the Dutch book argument. Therefore, it
ought to be possible to use the Dutch book argument to determine

(a) the class of analytically certain compound conditionals (i.e. the class of
compound conditional propositions on which bets can be won but 
cannot be lost), and 


(b) the relationships which must hold between the chosen betting quotients 
for various compound conditionals and other propositions if no Dutch 
book is to be possible. 


But unfortunately the Dutch book argument cannot be used in this way. 
It can certainly be used to determine rational betting quotients for com- 
pound conditionals so defined. But it is demonstrable that such betting 
quotients are not probability measures. Indeed it can be demonstrated that
in general rational betting quotients for semi-decidable propositions (i.e.
propositions that can become accepted as true, can become accepted as
false, and can just remain undecided) are not probability measures. This
result is of such fundamental importance for subjective probability theory
that I shall devote most of the following section to establishing it.


3 THE DUTCH BOOK ARGUMENT AND ITS LIMITATIONS 


There are two forms of the Dutch book argument, D1 in which a Dutch 
book is defined as a system of bets on which one of the participants must 
lose, and D2 in which it is a system of bets on which one of the participants
may lose but cannot win. A conditional bet is normally one which may
remain undecided, i.e. neither won nor lost. Therefore, if we wish to consider
books which may include conditional bets, it will rarely be the case that 
such books will be of the kind D1. Therefore, the argument D2 is appro- 
priate for books of this kind. 

The structure of the argument D2 is as follows: let Q(p) be the ratio of
outlay to expected return for a finite non-zero bet on p, and call Q(p) the
betting quotient for this bet. By a finite bet I mean any in which neither
outlay nor expected return is infinite. By the expected return I mean the
amount, including the initial stake, that would be returned to the punter if
the bet were won. By a non-zero bet I mean one in which the expected
return is non-zero. For such bets Q(p) is always well-defined and 0 < Q(p)
< 1. For any set of bets there will be a set of possible returns. If the total
outlay is not less than any of the possible returns, but greater than some,
then we have a Dutch book against the punter. If the total outlay is not
greater than any of the possible returns, but is less than some, then we have
a Dutch book against the bookmaker.
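
The D2 definitions just given translate directly into a small test. The following is a minimal sketch with a hypothetical function name and hypothetical example books; it compares the punter's total outlay with each possible total return, exactly as in the two conditions above, and makes no claim about which betting quotients are rational.

```python
from fractions import Fraction as F

def classify_book(total_outlay, possible_returns):
    """Classify a system of bets in the sense of the argument D2."""
    returns = list(possible_returns)
    if not returns:
        raise ValueError("a book needs at least one possible return")
    # Outlay not less than any possible return, and greater than some.
    if all(total_outlay >= r for r in returns) and any(total_outlay > r for r in returns):
        return "Dutch book against the punter"
    # Outlay not greater than any possible return, and less than some.
    if all(total_outlay <= r for r in returns) and any(total_outlay < r for r in returns):
        return "Dutch book against the bookmaker"
    return "no Dutch book"

# Unit-return bets on p and on ~p, each bought at quotient 3/5: exactly one pays.
both_sides_dear = classify_book(F(6, 5), [F(1), F(1)])
# The same pair of bets bought at quotient 2/5 each.
both_sides_cheap = classify_book(F(4, 5), [F(1), F(1)])
# One conditional bet on q given p at quotient 1/2: won, lost, or stake returned.
conditional = classify_book(F(1, 2), [F(1), F(0), F(1, 2)])
```

The third example shows why the weaker notion D2 matters for conditional bets: one of the possible returns is simply the stake handed back, so neither party need lose.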
The Dutch book argument D2 then demonstrates that no Dutch book of
finite, non-zero bets on any set of decidable propositions closed under 
negation, conjunction and disjunction is possible only if the betting 
quotients chosen for the various propositions are probability measures 
(i.e. measures that satisfy the axioms of PRC), a decidable proposition 
being any that must in the context become accepted as true or false. 
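The definition of a Dutch book in the D2 sense can be made concrete with a minimal sketch. The function names `simple_return`, `classify_book` and `book_on_p_and_not_p` are hypothetical labels introduced here for illustration only; exact fractions are used so that equality between outlay and return is not blurred by rounding.

```python
from fractions import Fraction


def simple_return(stake, quotient):
    """Amount returned to the punter if a bet is won: stake / betting quotient."""
    return Fraction(stake) / Fraction(quotient)


def classify_book(outlay, possible_returns):
    """Classify a book in the D2 sense (hypothetical helper, not from the text).

    outlay: the punter's total outlay.
    possible_returns: every total return the punter could receive.
    """
    if all(outlay >= r for r in possible_returns) and any(
        outlay > r for r in possible_returns
    ):
        return "Dutch book against the punter"
    if all(outlay <= r for r in possible_returns) and any(
        outlay < r for r in possible_returns
    ):
        return "Dutch book against the bookmaker"
    return "no Dutch book"


def book_on_p_and_not_p(stake_p, stake_not_p, q_p, q_not_p):
    """Bets on a decidable p and on ~p: exactly one of the two bets is won."""
    outlay = Fraction(stake_p) + Fraction(stake_not_p)
    returns = [simple_return(stake_p, q_p), simple_return(stake_not_p, q_not_p)]
    return classify_book(outlay, returns)
```

With quotients of 3/5 on both p and ~p the punter pays 6 to receive 5 whichever way p is decided; with quotients of 1/2 on each, the same pair of equal stakes breaks even in both cases.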

A set of betting quotients is said to be consistent iff no Dutch book with 
these betting quotients is possible. A given bet is said to be fair to a given 
individual at a given time iff he would have no preference between the 
roles of bookmaker and punter with respect to that bet. A betting quotient 
is said to be fair if the individual would consider any bet with that betting 
quotient to be fair. And the probability of a given proposition to a given 
individual at a given time is usually defined to be the betting quotient that 
he would consider at that time to be a fair one for that proposition. 

With these definitions in mind, the subjectivist advocates the following 
ideal of rationality: A rational man is one whose probability assignments at 
any given time are consistent. He would attempt to justify acceptance of this 
ideal by appealing to the principle that if one has the choice it would be 
irrational to accept any set of commitments as a result of which one may 
lose something, but cannot gain anything that one values. The subjectivist 
also advocates the following definition of validity: An argument is valid iff 
an ideally rational man cannot at the same time accept its premisses and 
reject its conclusion. This definition of validity is strictly applicable only to 
arguments all premisses and conclusions of which are probability claims. 
But clearly, by using the logical correspondence principle, it can also 
be applied to arguments involving truth and falsity claims. Classically, an 
argument is said to be valid iff there is no possible world in which its 
premisses are true and its conclusion false. It follows that if we wish to 
understand any statement sufficiently for all of the purposes of classical 
logic we need to know its truth conditions in all possible worlds. Therefore, 
if we wish to assess the validity of arguments involving probability claims, 
using the classical definition of validity, we need to know their truth 
conditions in all possible worlds. The subjectivist does not need such an 
analysis of probability, because he accepts a wider concept of validity. 

The Dutch book argument can also be used to determine rational
betting quotients for semi-decidable propositions, i.e. propositions that
could become accepted as true, could become accepted as false, and could
just remain undecided. But it is not difficult to show that rational
fair-betting quotients for semi-decidable propositions are not probability
measures. Consider a book on the set {~p, p ∧ q, p ∧ ~q} where p and q are
semi-decidable. Let the stakes be S(~p), S(p ∧ q), S(p ∧ ~q) and the
betting quotients Q(~p), Q(p ∧ q), Q(p ∧ ~q). Suppose Q(~p),
Q(p ∧ q) and Q(p ∧ ~q) ≠ 0, and that p is compatible with both q and
~q. Then the total outlay: L = S(~p) + S(p ∧ q) + S(p ∧ ~q).
The possible returns are:

    R1 = S(~p)/Q(~p)              (where T~p)
    R2 = S(p ∧ q)/Q(p ∧ q)        (where Tp and Tq)
    R3 = S(p ∧ ~q)/Q(p ∧ ~q)      (where Tp and T~q)
    R4 = L                        (where xp and xq)
    R5 = S(~p) + S(p ∧ q)         (where xp and Tq)
    R6 = S(~p) + S(p ∧ ~q)        (where xp and T~q)
    R7 = S(p ∧ q) + S(p ∧ ~q)     (where Tp and xq)

where ‘T’ = ‘becomes accepted as true’ and ‘x’ = ‘remains undecided’.

If L ≤ R1, R2, ..., R7 and L < at least one of R1, R2, ..., R7 then from
R5, R6 and R7, L = 0. Hence R1, R2, R3, R4 = 0, and there is no possible
return that L is less than. Hence no Dutch book is possible against the
bookmaker. If L ≥ R1, R2, ..., R7 and L > at least one of R1, R2, ..., R7
then since, necessarily, L ≥ R4, R5, R6, R7 if L ≠ 0 we require only that
L ≥ R1, R2 and R3. If L ≥ R1, R2 and R3 then

    (Q(~p) × L) + (Q(p ∧ q) × L) + (Q(p ∧ ~q) × L)
        ≥ S(~p) + S(p ∧ q) + S(p ∧ ~q),

    Q(~p) + Q(p ∧ q) + Q(p ∧ ~q) ≥ 1.

Therefore if 0 < Q(~p) + Q(p ∧ q) + Q(p ∧ ~q) < 1 no Dutch book
is possible against either punter or bookmaker if p and q are both
semi-decidable, and p is compatible with both q and ~q.

If Q(~p) + Q(p ∧ q) + Q(p ∧ ~q) = 1 then choose L ≠ 0, S(~p)
= L × Q(~p) and S(p ∧ q) = L × Q(p ∧ q). Then S(p ∧ ~q) =
L × Q(p ∧ ~q) and L = R1, R2, R3 and R4, and L ≥ R5, R6 and R7,
and L > at least one of R5, R6 and R7. Therefore, such a choice of stakes
will constitute a Dutch book against the punter. If for a given choice of
betting quotients for a given set of propositions a Dutch book can be made
out, then a Dutch book can be made out for these betting quotients on
any set of propositions which includes the given set by choosing the stakes
for bets on all other propositions = 0. Therefore, if the betting quotients
chosen for {~p, p ∧ q, p ∧ ~q} are probability measures, a Dutch
book is always possible on any set of propositions which includes p and q
and is closed under negation, conjunction and disjunction. Rational
fair-betting quotients for semi-decidable propositions are therefore not
probability measures.
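The argument for the book on ~p, p ∧ q and p ∧ ~q can be replayed numerically. The following is a minimal sketch under stated assumptions: the helper names, the particular quotients and the small grid of stakes are all chosen here for illustration and are not from the text. The grid check covers only a finite sample of stakes, so it illustrates rather than proves the general claim.

```python
from fractions import Fraction
from itertools import product


def seven_returns(stakes, quotients):
    """Possible returns R1..R7 for a book on ~p, p&q, p&~q (p, q semi-decidable)."""
    s_np, s_pq, s_pnq = stakes
    q_np, q_pq, q_pnq = quotients
    return [
        s_np / q_np,            # R1: ~p accepted as true
        s_pq / q_pq,            # R2: p and q accepted as true
        s_pnq / q_pnq,          # R3: p and ~q accepted as true
        s_np + s_pq + s_pnq,    # R4: p, q undecided, every stake returned
        s_np + s_pq,            # R5: p undecided, q true, bet on p&~q lost
        s_np + s_pnq,           # R6: p undecided, ~q true, bet on p&q lost
        s_pq + s_pnq,           # R7: p true, q undecided, bet on ~p lost
    ]


def dutch_book(stakes, quotients):
    """Return 'punter', 'bookmaker' or None for the given stakes."""
    outlay = sum(stakes)
    returns = seven_returns(stakes, quotients)
    if all(outlay >= r for r in returns) and any(outlay > r for r in returns):
        return "punter"
    if all(outlay <= r for r in returns) and any(outlay < r for r in returns):
        return "bookmaker"
    return None


# Quotients that sum to 1 (a probability measure), with stakes S = L x Q, L = 4.
measure = (Fraction(1, 2), Fraction(1, 4), Fraction(1, 4))
stakes_for_measure = tuple(4 * q for q in measure)

# Quotients that sum to less than 1, tried against a small grid of stakes
# (the grid includes zero stakes on some, but not all, of the three bets).
sub_measure = (Fraction(1, 4), Fraction(1, 4), Fraction(1, 4))
grid = [s for s in product(range(4), repeat=3) if sum(s) > 0]
verdicts_on_grid = {dutch_book(tuple(map(Fraction, s)), sub_measure) for s in grid}
```

Here `dutch_book(stakes_for_measure, measure)` reports a book against the punter, while no stakes on the grid yield a Dutch book for the quotients summing to 3/4.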

It follows that the subjectivist’s definition of personal probability must 
be rejected, or restricted in its application to decidable propositions. 
This is indeed a very severe limitation on the power of the Dutch book 
argument. For it follows that it cannot be used to determine whether a 
given set of probability assignments to a given set of propositions is 
consistent if any member of the set is semi-decidable. But if q/p is to be
construed as a proposition a simple bet on which necessarily has the 
same pay-off conditions as a bet on q made conditionally on p, then q/p 
must, except in special cases, be interpreted as a semi-decidable proposi- 
tion. Hence, the Dutch book argument D2 cannot be used to determine 
the relationships that should hold between rational probability assignments 
to compound conditional propositions. 

It may be possible to avoid this conclusion by modifying the betting 
practices relative to which the concepts of fair-betting quotient and Dutch 
book are defined. For example, we may adopt the non-standard betting 
rule that if any constituent bet in a book remains undecided the whole 
book remains undecided, and all stakes are returned. If this rule is adopted, 
then in the above example the only possible returns are R1, R2, R3 and
R4, and the Dutch book argument D2 yields the familiar result that if no
Dutch book is possible on ~p, p ∧ q, and p ∧ ~q the chosen betting
quotients must satisfy the relationship Q(~p) + Q(p ∧ q) + Q(p ∧ ~q)
= 1. But not only is it unclear why such a non-standard betting rule
should be adopted, it also blocks the derivation of the multiplication 
theorem. 

Relative to standard betting practices the possible returns for a book on
~p, p ⇒ q, p ∧ ~q, where p ⇒ q is by definition a proposition, a
simple bet on which necessarily has the same pay-off conditions as a bet
on q made conditionally on p, are:

    R1 = S(~p)/Q(~p) + S(p ⇒ q)      (where T(~p))
    R2 = S(p ⇒ q)/Q(p ⇒ q)           (where T(p) and T(q))
    R3 = S(p ∧ ~q)/Q(p ∧ ~q)         (where T(p) and T(~q))


where the betting quotients are non-zero and p and q are decidable
propositions.

If L ≥ R1, R2 and R3 and L > at least one of R1, R2 and R3 then

    Q(~p) × L ≥ S(~p) + Q(~p) × S(p ⇒ q)
    Q(p ⇒ q) ≥ S(p ⇒ q)/L
    Q(p ∧ ~q) ≥ S(p ∧ ~q)/L

where at least one of these relationships is a strict inequality. Therefore
by simple algebra

    (1 − Q(~p)) × Q(p ⇒ q) > 1 − Q(~p) − Q(p ∧ ~q)

Similarly, if L ≤ R1, R2 and R3 and L < at least one of R1, R2 and R3,
then

    (1 − Q(~p)) × Q(p ⇒ q) < 1 − Q(~p) − Q(p ∧ ~q)

So that if

    (1 − Q(~p)) × Q(p ⇒ q) = 1 − Q(~p) − Q(p ∧ ~q)

no Dutch book is possible where p and q are decidable and the betting
quotients are non-zero.
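The step described as simple algebra can be spelled out as follows. The abbreviations a, b, c and S_1, S_2, S_3 are introduced here only for brevity and are not the author's notation; the sketch assumes L > 0 and 0 < a < 1, and takes the first return to be the payout on ~p plus the returned stake of the conditional bet.

```latex
% Abbreviations (assumed here): a = Q(\sim p), b = Q(p \Rightarrow q), c = Q(p \wedge \sim q),
% S_1 = S(\sim p), S_2 = S(p \Rightarrow q), S_3 = S(p \wedge \sim q), L = S_1 + S_2 + S_3.
\begin{align*}
L \ge R_1 &\iff aL \ge S_1 + aS_2, \\
L \ge R_2 &\iff bL \ge S_2, \\
L \ge R_3 &\iff cL \ge S_3. \\
\intertext{Adding the first and third conditions and using $S_1 + S_3 = L - S_2$:}
(a + c)L &\ge L - (1 - a)S_2, \\
(1 - a)S_2 &\ge (1 - a - c)L. \\
\intertext{Since $1 - a > 0$, the second condition gives $(1 - a)bL \ge (1 - a)S_2$, hence}
(1 - a)\,b &\ge 1 - a - c,
\end{align*}
% with strict inequality as soon as one of the three conditions is strict.
```

Reading 1 − a as the quotient on p and 1 − a − c as the quotient on p ∧ q, the limiting equality is the multiplication theorem in its usual form.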

But relative to the proposed non-standard betting practice, the only 
possible returns are L, R2 and R3, and the Dutch book argument does not
yield the multiplication theorem. The conclusion that the Dutch book 
argument cannot be used to determine the relationship which should hold 
between rational probability assignments to compound conditional 
propositions thus appears to be inescapable. 

I see no way of avoiding the conclusion that our logics of truth and 
certainty should satisfy the logical correspondence principle. Nor do I 
think that the fragment PRC, of PRC can be accepted as an adequate 
logic of probability claims. The inference from 

(a) There is a 50 per cent chance of Kissinger being made Ambassador 
to China if McGovern is elected, and 
(b) There is a 50 per cent chance of Kissinger being made Ambassador 
to China if McGovern is not elected, 
to (c) There is no chance of Kissinger being made Ambassador to China, 
is as plainly invalid as any inference could be. But if ‘⊃’ is an adequate
representation of ‘if...then...’ within the context of a probability
claim, this argument must be considered to be valid, since P(p ⊃ q) =
P(~p ⊃ q) = ½ → P(q) = 0, is a valid inference in PRC,. Therefore,
I see no way of avoiding the conclusion that any adequate logic of proba- 
bility claims must be at least as strong as PRC. Therefore, by the logical 
correspondence principle, PC (which corresponds to PRC, only) is not 
an adequate logic of truth claims.

It may be objected that my argument here is really only a modified
version of the old argument that PC is unsatisfactory because of the 
paradoxes of material implication. As a matter of fact I do not consider 
this to be a bad argument, but to understand me in this way is to miss 
the whole point of my strategy. What I think I have shown is that no 
man in his right mind would use ‘⊃’ to represent ‘if...then...’ within
the context of a probability claim if he is trying to decide whether a given
argument in which this probability claim occurs is valid or not. For PRC
provides us with a way of representing this probability claim which is
demonstrably far more satisfactory. To use the impoverished probability
system PRC, when we could use the full apparatus of PRC to assess the
validity of such arguments would be as silly as using Aristotelian physics 
to plan a trip to the moon. If this conclusion is correct, then the inadequacy 
of PC as a logic of truth follows in one step from the inadequacy of PRC, 
as a logic of probability. 
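The inference about Kissinger's appointment, read with the material conditional, can be checked by brute force. This is a minimal sketch, assuming probabilities restricted to multiples of 1/8 over the four combinations of p and q; the grid and the function names are illustrative choices made here, not part of the original argument.

```python
from fractions import Fraction
from itertools import product


def grid_distributions(n):
    """Yield all distributions over (p&q, p&~q, ~p&q, ~p&~q) in steps of 1/n."""
    for a, b, c in product(range(n + 1), repeat=3):
        d = n - a - b - c
        if d >= 0:
            yield tuple(Fraction(k, n) for k in (a, b, c, d))


def satisfies_premisses(dist):
    """P(p materially implies q) = P(~p materially implies q) = 1/2."""
    pq, p_nq, np_q, np_nq = dist
    half = Fraction(1, 2)
    # p > q fails only on p&~q; ~p > q fails only on ~p&~q.
    return 1 - p_nq == half and 1 - np_nq == half


solutions = [d for d in grid_distributions(8) if satisfies_premisses(d)]
prob_q_values = {pq + np_q for pq, _, np_q, _ in solutions}
```

On this grid the only distribution satisfying both premisses puts probability 1/2 on each of p ∧ ~q and ~p ∧ ~q, so q receives probability 0, which is the conclusion the text calls plainly invalid for ‘if...then...’.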

Since the Dutch book argument has been shown to be useless as a tool 
for deriving the required systems PC* and PRC*, the question arises how 
we should proceed. I am reluctant to abandon altogether the subjectivist 
approach that I have adopted. For if we return to the classical definition
of validity and adopt a classical approach to the problem of compound 
conditionals we shall not only have to find an analysis of probability 
statements in terms of their truth conditions in all possible worlds, we 
shall also have to have such an analysis of compound conditionals. And I 
am not at all hopeful that any such analyses can be found. 

The approach that I now wish to investigate is one that preserves the 
subjectivist’s concept of validity, and proceeds to define the ideal of a 
rational man by setting out acceptability conditions for the various kinds 
of propositions that we have to deal with. Roughly, the idea is to define 
the various propositional connectives and operators not in terms of truth 
conditions, but in terms of acceptability conditions, and then use these 
definitions to discover the class of propositions that can rationally be 
accepted as true, but cannot rationally be accepted as false. I will assume 
that there exists a set of propositions that are semi-decidable, and I will 
allow my propositional variables to range only over such propositions. 
Then if we have a propositional formula, every instance of which can 
rationally be accepted as empirically true, and no instance of which can 
rationally be accepted as empirically false, we may say that this proposi- 
tional formula is tautologous, and is a theorem of the required augmented 
propositional calculus PC*. In other words, I propose to make strict 
coherence the criterion for tautologousness and hence for theoremhood.


4 ACCEPTABILITY CONDITIONS


The propositional connectives and operators of PC are normally defined in 
terms of truth conditions in possible worlds. Thus, the propositional 
disjunction of p and q is any proposition that is true just in case at least
one of p and q is true and false iff both p and q are false. But if we are to
use the subjectivists’ concept of validity, we must try to define these 
connectives and operators in terms of acceptability conditions. 

To formulate satisfactory rules for this purpose I make a distinction 
between necessary and empirical truth claims. If I make a necessary truth 
claim concerning some proposition, i.e. say something of the form ‘It is
necessarily true that p’ then epistemologically I become committed to 
defending the truth claim concerning this proposition in a certain kind of 
way, viz. logically. Therefore, in making a necessary rather than a simple 
truth claim, I have as it were tied my empirical hand behind my back. 
For example, to defend the claim that it is necessarily true that Caesar 
either winked or did not wink as he crossed the Rubicon, it would be 
irrelevant to produce conclusive empirical evidence that Caesar did in 
fact wink. For only logical considerations are relevant to the defence of a 
necessary truth claim. Now what is needed here is a statement-forming 
operator on propositions similar to ‘It is necessarily true that’, which, as it 
were, would tie our logical hands behind our backs leaving our empirical 
hands free. For this purpose I will use the expression ‘It is empirically 
true that’ and stipulate that in defence of an empirical truth claim only 
considerations that are not purely logical are relevant. So that, for example, 
to justify the empirical truth claim that Caesar either winked or did not 
wink as he crossed the Rubicon it would be irrelevant to point out that 
this proposition is a tautology. Rather it would need to be shown either 
that Caesar did in fact wink, or that he did not in fact wink as he crossed 
the Rubicon. 

It is important to be clear that ‘It is empirically true that’ does not have 
the same function as the operator ‘It is contingently true that’. For the 
contingent truth claim concerning p is inconsistent with the necessary 
truth claim concerning p, but the empirical truth claim concerning p is 
consistent with the necessary truth claim concerning p. Any justification 
for accepting the contingent truth claim concerning p is a justification 
for the empirical truth claim concerning this proposition, but not
conversely.

Given this concept of empirical truth, we can now attempt to formulate
acceptability rules for propositional negations, conjunctions and
disjunctions:


    ~ |         ∧ | T  F  X         ∨ | T  F  X
    T | F       T | T  F  X         T | T  T  T
    F | T       F | F  F  F         F | T  F  X
    X | X       X | X  F  X         X | T  X  X


‘T’ = ‘is accepted as empirically true’ 

‘F’ = ‘is accepted as empirically false’ 

‘X’ = ‘remains empirically undecided’ 
Then if a tautology is defined to be any proposition that can become 
accepted as empirically true and cannot become accepted as empirically 
false, it is easily demonstrated that if the variables of PC are allowed to 
range only over propositions that are empirically semi-decidable (i.e. as
we are now using the term, propositions that can become accepted as 
empirically true, empirically false and can just remain empirically un- 
decided) then every instance of every theorem of PC is a tautology. 
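The claim that theorems of PC come out as tautologies under these tables can be spot-checked mechanically. The sketch below assumes the three-valued reading of the tables in which T ∧ X = X, F ∧ X = F, T ∨ X = T, F ∨ X = X and X combined with X gives X; the formula encoding and function names are hypothetical conveniences introduced here. It tests individual formulas only and is not a proof of the general result.

```python
from itertools import product

T, F, X = "T", "F", "X"
NOT = {T: F, F: T, X: X}
AND = {
    (T, T): T, (T, F): F, (T, X): X,
    (F, T): F, (F, F): F, (F, X): F,
    (X, T): X, (X, F): F, (X, X): X,
}
OR = {
    (T, T): T, (T, F): T, (T, X): T,
    (F, T): T, (F, F): F, (F, X): X,
    (X, T): T, (X, F): X, (X, X): X,
}


def evaluate(formula, assignment):
    """Evaluate a formula given as a variable name or a nested tuple."""
    if isinstance(formula, str):
        return assignment[formula]
    op, *args = formula
    values = [evaluate(arg, assignment) for arg in args]
    if op == "not":
        return NOT[values[0]]
    if op == "and":
        return AND[values[0], values[1]]
    if op == "or":
        return OR[values[0], values[1]]
    if op == "implies":  # material conditional, read as ~a v b
        return OR[NOT[values[0]], values[1]]
    raise ValueError(f"unknown operator: {op}")


def possible_values(formula, variables):
    """Acceptability values the formula takes over all assignments of T, F, X."""
    return {
        evaluate(formula, dict(zip(variables, combo)))
        for combo in product((T, F, X), repeat=len(variables))
    }


def is_tautology(formula, variables):
    """Can become accepted as empirically true, cannot become accepted as false."""
    values = possible_values(formula, variables)
    return T in values and F not in values


excluded_middle = ("or", "p", ("not", "p"))
peirce = ("implies", ("implies", ("implies", "p", "q"), "p"), "p")
```

Both `excluded_middle` and `peirce` pass `is_tautology`, although `evaluate(excluded_middle, {"p": X})` is X: a tautology in this sense may remain empirically undecided.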

Doubts arise, however, over the final entries in the empirical
acceptability tables for ‘∧’ and ‘∨’. For these entries entail that if p remains
empirically undecided then both p ∧ ~p and p ∨ ~p must also remain
empirically undecided. But as the concepts of empirical truth and falsity
have been defined, this is what is to be expected. The proposition that 
Caesar both winked and did not wink as he crossed the Rubicon is logically, 
but not empirically false. For the falsity claim concerning this proposition 
is, so far as I know, defensible only logically. But such considerations are, 
by definition, irrelevant to the acceptability of empirical truth or falsity 
claims. 

A more serious difficulty arises when we consider a proposition such as 
‘The coin will land either heads or tails’. For we may wish to accept this
proposition as empirically true, even though we would not accept either 
disjunct as empirically true. The reason would appear to be that in this 
case we would accept the following conditionals as empirically true: 
‘If the coin does not land tails, then it will land heads’ and ‘If the coin does 
not land heads, then it will land tails’. But then, when we enquire under 
what conditions we should accept these conditionals as empirically true, 
when both their antecedents and consequents remain empirically un- 
decided, we find that we cannot do better than say that these conditionals 
should be accepted as empirically true under these conditions only if the 
original disjunction would be so accepted. Therefore the observation that 
where Xp and Xq, the disjunction p ∨ q should be accepted as empirically
true only if certain conditionals are so accepted, while correct, is unhelpful
at this stage. And for the time being, we cannot do better than accept the
following rules:

    ~ |         ∧ | T  F  X           ∨ | T  F  X
    T | F       T | T  F  X           T | T  T  T
    F | T       F | F  F  F           F | T  F  X
    X | X       X | X  F  (F X)       X | T  X  (T X)


where the bracketed entries in the last positions of the tables for ‘∧’ and
‘v’ indicate that the propositions concerned may have either acceptability 
value. It can still be demonstrated that these tables constitute a model for 
PC, and hence that every instance of every theorem of PC is a proposition 
that can be accepted as empirically true, but cannot be accepted as empiri- 
cally false. 

Doubts may also arise about the proposed definition of tautologousness, 
since it does not appear to follow from this definition that a tautology must 
be accepted as true, only that it cannot rationally be accepted as false. But 
all that follows from the definition is that a tautology need not be accepted 
as empirically true. Empirically, it may remain undecided. But since we
know, as a result of analysis, that a decision could only go one way in the 
case of a tautology, a rational man will certainly accept that proposition 
as true. But he will accept it as logically rather than empirically true. 

A more troublesome objection to the proposed definition of tautologous- 
ness is that we cannot show by means of these acceptability tables that 
logically equivalent propositions must have the same acceptability value. 
It can be shown that if one side of a logical equivalence is accepted as 
empirically true, the other cannot be accepted as empirically false, and 
conversely, but it cannot be demonstrated, by means of these tables, that 
if one side is accepted as empirically true or false the other must also be. 
For example, the assignments X(p), T(q), X(r) are consistent with
p ∨ (q ∧ r) being accepted as empirically true and (p ∨ q) ∧ (p ∨ r)
remaining empirically undecided. However, we know that (p ∨ (q ∧ r)) ≡
((p ∨ q) ∧ (p ∨ r)) can be accepted as empirically true, but cannot be
accepted as empirically false. Moreover, we can demonstrate that if
α ≡ β is accepted as empirically true, then α and β must have the same
empirical acceptability value. Hence if one side of a logical equivalence is
accepted as empirically true, there is no point in remaining undecided about
the other, since a decision could only go one way. Therefore, any evidence
that is sufficient for establishing the empirical truth of one side of a logical
equivalence is sufficient for establishing the truth of the other. In this
particular example, the considerations which would lead us to accept the
truth of (p ∨ q) ∧ (p ∨ r) are a mixture of logical and empirical. If we
accept the empirical truth claim concerning p ∨ (q ∧ r) we must accept
the truth claim concerning (p ∨ q) ∧ (p ∨ r). Where such a mixture of
logical and empirical considerations would justify the acceptance of a
truth claim, we shall consider the claim to have been empirically established.
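The distribution example can be replayed with a set-valued reading of the revised tables. This is a minimal sketch which assumes that each bracketed entry is a free choice made independently for each compound; that assumption, the formula encoding and the function name are introduced here for illustration.

```python
T, F, X = "T", "F", "X"
NOT = {T: {F}, F: {T}, X: {X}}
AND = {
    (T, T): {T}, (T, F): {F}, (T, X): {X},
    (F, T): {F}, (F, F): {F}, (F, X): {F},
    (X, T): {X}, (X, F): {F}, (X, X): {F, X},  # bracketed entry (F X)
}
OR = {
    (T, T): {T}, (T, F): {T}, (T, X): {T},
    (F, T): {T}, (F, F): {F}, (F, X): {X},
    (X, T): {T}, (X, F): {X}, (X, X): {T, X},  # bracketed entry (T X)
}


def possible_values(formula, assignment):
    """Set of acceptability values the tables allow for a formula."""
    if isinstance(formula, str):
        return {assignment[formula]}
    op, *args = formula
    if op == "not":
        return {v for a in possible_values(args[0], assignment) for v in NOT[a]}
    if op not in ("and", "or"):
        raise ValueError(f"unknown operator: {op}")
    table = AND if op == "and" else OR
    left = possible_values(args[0], assignment)
    right = possible_values(args[1], assignment)
    return {v for a in left for b in right for v in table[a, b]}


assignment = {"p": X, "q": T, "r": X}
lhs = ("or", "p", ("and", "q", "r"))
rhs = ("and", ("or", "p", "q"), ("or", "p", "r"))
lhs_values = possible_values(lhs, assignment)
rhs_values = possible_values(rhs, assignment)
```

Both sides may come out T or X, and nothing in the tables alone ties the choice on one side to the choice on the other, which is why a supplementary rule for logical equivalents is needed.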

To formalise this, we may supplement the proposed acceptability tables 
with the rule that if α ≡ β is a material equivalence that can be
demonstrated to be tautologous, then α and β must have the same empirical
acceptability value. It follows that in any analysis of the acceptability 
conditions for compound propositions logically equivalent propositions 
may always be substituted for each other. 

It should be noted that the proposed acceptability tables apply also 
where ‘T’, ‘F’ and ‘X’ are interpreted as ‘is accepted as necessarily true’, 
‘is accepted as necessarily false’ and ‘is accepted as neither necessarily true 
nor necessarily false’ respectively. And if we agree to say that a proposition 
must be accepted as necessarily necessarily true if it can be accepted as 
necessarily true, but cannot be accepted as necessarily false, then since 
every tautology must be accepted as necessarily true, it must also be 
accepted as necessarily necessarily true. The proposed acceptability tables 
should thus provide a foundation for a modal system as well as for a 
logic of truth and falsity claims. 


5 CONDITIONALS 


The suggestion of §1 of this paper was that we should define the conditional
proposition p ⇒ q to be any proposition a simple bet on which necessarily
has the same pay-off conditions as a bet on q made conditionally on p. Given
that a simple bet is to be paid only if the proposition concerned becomes
accepted as empirically true, this is equivalent to the following
acceptability table for ‘⇒’:


    ⇒ | T  F  X
    T | T  F  X
    F | X  X  X
    X | X  X  X

But this suggestion was altogether too naive. There is no doubt that within
the context of a probability claim ‘⇒’ so defined is a far better
representation of ‘if...then...’ than ‘⊃’. But clearly a conditional can become
accepted as empirically true even if its antecedent and consequent are
accepted as empirically false or remain empirically undecided. If we
accept the above table and define a self-contradiction to be any proposition
that can become accepted as empirically false, but cannot become accepted
as empirically true, then it turns out that (p ⇒ q) ∧ ~q is
self-contradictory, which is absurd if ‘⇒’ is to be interpreted as a
representation of ‘if...then...’. A better acceptability table for ‘⇒’ is
the following:

    ⇒ | T  F  X
    T | T  F  X
    F | —  —  —
    X | T  F  —

where ‘—’ is to be read as ‘may have any acceptability value’. On it,
(p ⇒ q) ∧ ~q is at least consistent. But even this revised acceptability
table is unsatisfactory because it cannot reasonably be assumed that every
conditional the antecedent of which is accepted as empirically false is
independent of every other such conditional. The acceptability value of
p ⇒ ~q, for example, cannot be independent of that for p ⇒ q, even if
p is accepted as empirically false. But this revised acceptability table
requires us to treat all subjunctive conditionals as though they were mutually
independent propositions.
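The complaint against the first, naive table for the conditional can be verified directly. The sketch below assumes that table gives the consequent's value when the antecedent is T and X in every other case, and combines it with the first table for conjunction; the names are hypothetical and chosen here for illustration.

```python
from itertools import product

T, F, X = "T", "F", "X"
NOT = {T: F, F: T, X: X}
AND = {
    (T, T): T, (T, F): F, (T, X): X,
    (F, T): F, (F, F): F, (F, X): F,
    (X, T): X, (X, F): F, (X, X): X,
}
# Naive table: the conditional bet is decided only if the antecedent is T.
COND = {
    (T, T): T, (T, F): F, (T, X): X,
    (F, T): X, (F, F): X, (F, X): X,
    (X, T): X, (X, F): X, (X, X): X,
}


def conditional_and_not_consequent(p, q):
    """Acceptability value of (p => q) & ~q under the naive table."""
    return AND[COND[p, q], NOT[q]]


values = {
    conditional_and_not_consequent(p, q)
    for p, q in product((T, F, X), repeat=2)
}
# Self-contradiction: can become accepted as false, never as true.
is_self_contradictory = F in values and T not in values
```

The conjunction can only come out F or X: for it to be T the conditional must be T, which on the naive table requires the consequent to be T as well.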

I want to say that a rational man will accept p ⇒ q as empirically true iff
(a) he does not accept p as false, but accepts p ⊃ q as empirically true, or
(b) he accepts p as empirically false, but considers that there is a coherent
    system of beliefs about the world which includes both T(p) and T(q)
    which is nearer to or more in conformity with his actual system of
    beliefs than any which includes T(p) and either X(q) or F(q).

Likewise, I want to say that a rational man will accept p ⇒ q as
empirically false iff
(a) he does not accept p as false, but accepts p ⊃ ~q as empirically true,
    or
(b) he accepts p as empirically false, but considers that there is a coherent
    system of beliefs about the world which includes both T(p) and F(q)
    which is nearer to his actual system of beliefs than any which includes
    T(p) and either X(q) or T(q).

In any other case, a rational man will remain empirically undecided about
p ⇒ q, i.e. his system of beliefs will include X(p ⇒ q) iff
(a) it includes X(p ⊃ q) and X(p ⊃ ~q), but not F(p), or
(b) it includes F(p), but neither T[p]q nor T[p]~q
where ‘T[p]q’ represents the belief that there is a coherent system of
beliefs which includes both T(p) and T(q) which is nearer to his own
belief set than any which includes T(p) but not T(q). To give substance to
this definition, the concept of ‘nearness’ must be explicated. I assume that
the relationship of nearness has the formal properties of the spatial
relationship of nearness in a flat space of arbitrarily many dimensions, so
that a belief system may be represented by a point in such a space in such
a way that we may speak unambiguously of the epistemological distance
between any two belief systems. I assume that every coherent belief system
is at a finite epistemological distance from every other such system, but
that any incoherent belief system is more remote from any coherent one
than any other coherent belief system. By a coherent belief system I mean
any that can be held by a rational man. I assume that for a rational man a
judgment of relative epistemological distance is a subjective matter subject
to these formal requirements, and to the following: if a coherent belief
system B contains X(p) and T(q), and T(p) is coherent, then there is a
coherent belief system containing T(p) and T(q) that is nearer to B than
any that contains T(p) and either X(q) or F(q).

It follows from this definition that if p is accepted as empirically true,
then p ⇒ q will be accepted as empirically true, false or remain empirically
undecided according as q is accepted as empirically true, false or remains
empirically undecided. It also follows that if T(p) is coherent and p
remains empirically undecided, then p ⇒ q will be accepted as empirically
true or false according as q is accepted as empirically true or false. The
definition thus agrees as it should with the second proposed acceptability
table for ‘⇒’. It is stronger than the table, however, since for example, it
follows from the definition that p ⇒ ~q is the propositional negation of
p ⇒ q and that p ⇒ (q ∧ r) and p ⇒ (q ∨ r) are respectively the
propositional conjunction and disjunction of p ⇒ q and p ⇒ r.

The proof of the latter is: Suppose a rational belief system B1 contains
F(p) and T(p ⇒ (q ∧ r)). Then by definition there is a coherent belief
system B2 that contains T(p) and T(q ∧ r) that is nearer to B1 than any
that contains T(p) and X(q ∧ r), or T(p) and F(q ∧ r). Since B2 is coherent
it must contain T(p), T(q) and T(r). If a coherent system of beliefs contains
X(q) then it must also contain X(q ∧ r) or F(q ∧ r). Therefore, B2 is
nearer to B1 than any system of beliefs that contains both T(p) and X(q).
Similarly, if it contains X(r), then B2 is nearer to B1 than any that contains
both T(p) and X(r). If a coherent system of beliefs contains F(q) then it also
contains F(q ∧ r), and if it contains F(r) then it also contains F(q ∧ r).
Therefore B2 is nearer to B1 than any coherent belief system that contains
either X(q), X(r), F(q) or F(r). B2 is nearer to B1 than any incoherent
belief system. Therefore, if B1 contains F(p) and T(p ⇒ (q ∧ r)) it must
also contain both T(p ⇒ q) and T(p ⇒ r) and hence T((p ⇒ q) ∧ (p ⇒ r)).

Suppose that a coherent system of beliefs B1 contains F(p) and both
T(p ⇒ q) and T(p ⇒ r). Let B2 be any coherent system of beliefs that
contains both T(p) and T(q) that is nearer to B1 than any that contains
T(p) and either X(q) or F(q). We know that B2 exists. Let B3 be any
coherent system of beliefs that contains T(p) and T(r) that is nearer to B1
than any that contains T(p) and either X(r) or F(r). B3 also exists. Let B4
be any coherent system of beliefs that contains T(p) that is at least as near
to B1 as B2 or B3. Then B4 exists, contains T(p), T(q) and T(r), and is
nearer to B1 than any coherent belief system that contains T(p) and either
X(q), X(r), F(q) or F(r). If a coherent belief system contains T(p) and
X(q ∧ r) or F(q ∧ r), then it contains T(p) and at least one of X(q), X(r),
F(q) and F(r). Therefore B4 is nearer to B1 than any coherent system of
beliefs that contains both T(p) and either X(q ∧ r) or F(q ∧ r). B4 is
nearer to B1 than any incoherent belief system. Therefore, any coherent
belief system that contains F(p) and T((p ⇒ q) ∧ (p ⇒ r)) also contains
T(p ⇒ (q ∧ r)).

Next, suppose a coherent belief system contains either T(p) or X(p) and
T(p ⇒ (q ∧ r)). Then, by definition it contains T(p ⊃ (q ∧ r)), and hence
both T(p ⊃ q) and T(p ⊃ r). But a coherent belief system that contains
either T(p) or X(p) and T(p ⊃ q) also contains T(p ⇒ q). Therefore any
coherent belief system that contains T(p) or X(p) and T(p ⇒ (q ∧ r)) also
contains T((p ⇒ q) ∧ (p ⇒ r)). On the other hand, if a coherent belief
system contains either T(p) or X(p) and both T(p ⇒ q) and T(p ⇒ r),
then it contains both T(p ⊃ q) and T(p ⊃ r) and hence T(p ⊃ (q ∧ r))
and T(p ⇒ (q ∧ r)). Therefore, quite generally, any coherent belief
system that contains T(p ⇒ (q ∧ r)) contains T((p ⇒ q) ∧ (p ⇒ r)) and
conversely.

It does not follow from this that p ⇒ (q ∧ r) is the propositional
conjunction of p ⇒ q and p ⇒ r. To demonstrate this, it is also necessary to
show that F(p ⇒ (q ∧ r)) will be an element of a coherent belief system iff
F((p ⇒ q) ∧ (p ⇒ r)) is an element of such a system. We could
demonstrate this easily enough if we could show that T(p ⇒ (q ∨ r)) will be
an element of a coherent belief system iff T((p ⇒ q) ∨ (p ⇒ r)) is an element.
And this in turn could be demonstrated if we could show that there is no
coherent belief system in which both T(p ⇒ (q ∨ r)) and F((p ⇒ q) ∨
(p ⇒ r)) or both F(p ⇒ (q ∨ r)) and T((p ⇒ q) ∨ (p ⇒ r)) occur. If
F((p ⇒ q) ∨ (p ⇒ r)) occurs, then so does F(p ⇒ q) and F(p ⇒ r), and
hence T(p ⇒ ~q) and T(p ⇒ ~r) must also occur. But from what we
have already established if T(p ⇒ ~q) and T(p ⇒ ~r) occur, then so
must T(p ⇒ (~q ∧ ~r)) and hence F(p ⇒ (q ∨ r)) must occur. Therefore
there is no coherent belief system in which both T(p ⇒ (q ∨ r)) and
F((p ⇒ q) ∨ (p ⇒ r)) occur. On the other hand, if F(p ⇒ (q ∨ r)) occurs,
then F(p ⇒ q) and F(p ⇒ r) must both occur, and hence F((p ⇒ q) ∨
(p ⇒ r)) must also occur. Therefore, there is no coherent belief system in
which both F(p ⇒ (q ∨ r)) and T((p ⇒ q) ∨ (p ⇒ r)) occur. Therefore,
T(p ⇒ (q ∨ r)) will occur as an element in a coherent belief system iff
T((p ⇒ q) ∨ (p ⇒ r)) occurs. We thus obtain that p ⇒ (q ∧ r) is the
propositional conjunction of p ⇒ q and p ⇒ r and that p ⇒ (q ∨ r) is
the propositional disjunction of p ⇒ q and p ⇒ r.

This important result enables us to regard an empirical truth or falsity 
claim concerning a conditional whose antecedent is accepted as empirically 
false as a kind of conditional empirical truth or falsity claim concerning its 
consequent. For if ‘T[α]β’ represents the belief that there is a coherent
belief system that contains both T(α) and T(β) that is nearer to one’s own
belief system than any that contains T(α) but not T(β), ‘F[α]β’ the belief
that there is a coherent belief system that contains both T(α) and F(β) that
is nearer to one’s own belief system than any that contains T(α) but not
F(β), and ‘X[α]β’ the belief that there is no coherent belief system such
that either
(a) it contains both T(α) and T(β) and is nearer to one’s own belief system
than any that contains T(α) but not T(β), or
(b) it contains both T(α) and F(β) and is nearer to one’s own belief system
than any that contains T(α) but not F(β),
then ‘T[α]β’, ‘F[α]β’ and ‘X[α]β’ have properties formally analogous to the
first order acceptability predicates ‘T’, ‘F’ and ‘X’.
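
The three second order predicates just defined can be made concrete on a small finite model. The sketch below is an illustration only, under assumptions that are not in the text: a belief system is represented as a dictionary from sentence names to "T", "F" or "X", and nearness to one's own belief system is a number supplied by hand. The names `nearer_with`, `second_order` and `EXAMPLE` are hypothetical.

```python
# Toy model (an assumption, not the author's formalism): each entry pairs a
# hand-chosen distance from one's own belief system with a belief system,
# i.e. a mapping from sentence names to "T", "F" or "X".

def nearer_with(systems, alpha, beta, value):
    """True iff some system containing T(alpha) and value(beta) is strictly
    nearer than every system containing T(alpha) but not value(beta)."""
    with_value = [d for d, s in systems if s[alpha] == "T" and s[beta] == value]
    without = [d for d, s in systems if s[alpha] == "T" and s[beta] != value]
    if not with_value:
        return False
    return not without or min(with_value) < min(without)


def second_order(systems, alpha, beta):
    """Return "T", "F" or "X" according to which of T[a]b, F[a]b, X[a]b holds."""
    if nearer_with(systems, alpha, beta, "T"):
        return "T"
    if nearer_with(systems, alpha, beta, "F"):
        return "F"
    return "X"


# Three candidate belief systems at distances 1, 2 and 0 from one's own.
EXAMPLE = [
    (1, {"a": "T", "b": "T"}),
    (2, {"a": "T", "b": "F"}),
    (0, {"a": "F", "b": "X"}),
]

verdict = second_order(EXAMPLE, "a", "b")  # "T": the nearest a-system has T(b)
```

On this reading at most one of the three verdicts can hold for a given pair, which mirrors the mutual exclusiveness of the first order predicates. A tie in distance between the nearest system with T(β) and the nearest with F(β) yields "X".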

For the purposes of analysis in terms of acceptability conditions, we 
may adapt the ‘semantic tableaux’ or ‘tree’ method. Then the information 
that we have so far derived about coherent belief systems may be sum- 
marised in the following tableaux: 


(1) T~α ── Fα
(2) F~α ── Tα
(3) X~α ── Xα





O Neža 
(5) T(α ∧ β) ── Tα, Tβ

(6) F(α ∧ β) ── Fα
             ── Fβ
             ── Xα, Xβ, F[α]β, F[β]α

(7) X(α ∧ β) ── Tα, Xβ
             ── Xα, Tβ
             ── Xα, Xβ, N[α]~β
             ── Xα, Xβ, N[β]~α

(8) T(α ∨ β) ── Tα
             ── Tβ
             ── Xα, Xβ, T[~β]α, T[~α]β

(9) F(α ∨ β) ── Fα, Fβ

(10) X(α ∨ β) ── Fα, Xβ
              ── Xα, Fβ
              ── Xα, Xβ, N[~α]β
              ── Xα, Xβ, N[~β]α

(11) T(α ⇒ β) ── N~α, Tβ, T[α]α
              ── Fα, T[α]β
              ── Xα, Xβ, T(α ⊃ β)

(12) F(α ⇒ β) ── N~α, Fβ, T[α]α
              ── Fα, F[α]β
              ── Xα, Xβ, T(α ⊃ ~β)

(13) X(α ⇒ β) ── Tα, Xβ
              ── Fα, X[α]β
              ── Xα, Xβ, X(α ⊃ β), X(α ⊃ ~β)


These are the tableaux for the first order acceptability predicates ‘T’,
‘F’ and ‘X’. Corresponding to these there is a set of tableaux for second
order acceptability predicates in which ‘T[α]’, ‘F[α]’ and ‘X[α]’ replace
‘T’, ‘F’ and ‘X’, and ‘β’ and ‘γ’ replace ‘α’ and ‘β’ respectively, and in
which ‘T[α]β’, ‘F[α]β’ and ‘X[α]β’ are replaced by the third order
acceptability predicates ‘T[α, β]γ’, ‘F[α, β]γ’ and ‘X[α, β]γ’. The tableaux for
third and higher order acceptability predicates can be constructed analogously.

Fully expanded, the belief represented by ‘T[α, β]γ’ is the belief that there
is a coherent belief system B₁ containing T(α) and a coherent belief system
B₂ containing both T(β) and T(γ) that is nearer to B₁ than any that contains
T(β) but not T(γ), and B₁ is nearer to one’s own belief system than any
that contains T(α) and relative to which there is no belief system that
contains T(β) and T(γ) that is nearer than every belief system that contains
T(β) but not T(γ). I doubt really whether we have much use for such
subjunctive, subjunctive conditionals as these. But I suppose we should
recognise their possibility. In most cases where we say something of the
form ‘If p, then if q then r’ we are just making the claim that if both p and q
then r. But given the definition of conditionalisation, it cannot be shown
that T[α, β]γ will occur in a coherent belief system iff T[α ∧ β]γ occurs.

The various tableaux are to be understood in the following way: the 
belief that is represented by the analysans will occur in a coherent belief 
system iff at least one of the sets of beliefs represented in the analysandum 
occurs. I assume that at least one, but no two of ‘T(«)’, ‘F(«)’, and ‘X(a)’ 
will occur in any coherent belief system, and hence that any branch of any 
tree in which any two such expressions occur must be considered to be 
closed. There are analogous closure rules for second and higher order 
acceptability predicates. I have introduced the acceptability predicate ‘N’
as an abbreviation only. In any analysis it can be replaced by the branching
tree 

X 

a 





It is a convenient abbreviation, however, since any branch of any tree must 
be considered to be closed if it contains both T(«) and N(«). The abbrevia- 
tion ‘N[«]’ is introduced analogously and a similar closure rule applies. 

It follows from the definition of conditionalisation that T[α]α will occur
in a coherent belief system iff it is believed that there is a coherent belief
system in which T(α) occurs. ‘T[α]α’ thus represents the belief that α is
logically possible. It also follows that ‘F[α]α’ represents the belief that there
is a coherent belief system in which both T(α) and F(α) occur such that,
etc. F[α]α cannot therefore occur in any coherent belief system, and any
branch of any tree that contains such an expression must be considered
to be closed. Finally, X[α]α turns out to be the belief that there is no
coherent belief system in which T(α) occurs. ‘X[α]α’ thus represents the belief
that α is not logically possible. The belief that α is logically necessary will
therefore occur in a coherent belief system iff both T[α]α and X[~α]~α
occur.

In addition to the acceptability tableaux the following inference rules 
can also be justified. 

(1) If there is no coherent belief system in which both T(α) and F(β)
occur, and T(β) is coherent, then if T(α) occurs in a given coherent belief
system, T(β) also occurs.

(2) If α ≡ β is tautologous, α and β have the same empirical acceptability
value in any coherent belief system, and ‘α’ may be substituted for ‘β’ in
any context.

(3) If a coherent belief system contains T[α]β, T[β]α and T[α]γ, then
it also contains T[β]γ, i.e.


T[α]β, T[β]α, T[α]γ → T[β]γ

where the arrow indicates that what is being offered is an inference rule, 
not an analysis. 

(4) Tle) > Tola 

(5) Tle] > Tle A gie A P) 

(6) Tle A Bly > Tale, TIBE, Tiy 

(7) The, Bly > Thala, TIB AYIB Ay) 

(8) T[α]β, T[α ∧ β]γ → T[α]γ

alp, 4 [afy ——> fla A 
(9) TJ, Tlely EEA 
(10) T(α ⇒ β) → T[α]β.

No doubt other rules can be justified, but these should suffice for the
derivation, at least in broad outline, of the required system PC*.


6 NOTES ON THE SYSTEMS PC* AND PRC* 

Let p, q, r, ... be propositional variables ranging over propositions that are
empirically semi-decidable, and α, β, γ, ... be propositional formulae
constructed from p, q, r, ... by the operations of propositional negation,
conjunction, disjunction and conditionalisation. Then we have:

(a) If α is a theorem of PC then α is a theorem of PC*.

(b) If β is a theorem of PC* and T(α) is coherent, then α ⇒ β is a theorem
of PC*.

(c) If α is a theorem of PC* then ~α ⇒ β is analytically undecidable,
i.e. no coherent belief system can contain either T(~α ⇒ β) or
F(~α ⇒ β).

(d) If α and α ⇒ β are theorems of PC*, then β is a theorem of PC*.

(e) If α and α ⊃ β are theorems of PC*, then β is a theorem of PC*.

(f) If φ(α) and α ≡ β are theorems of PC*, then φ(β) is a theorem of PC*,
where ‘φ(β)’ is obtained from ‘φ(α)’ by replacing any occurrence of ‘α’
with ‘β’.

Some special theorems relating to ‘⇒’ are the following:

(1) p ⇒ p

(2) (PA =H) =(6A9) 
(3) (~q ∧ (p ⇒ q)) ⊃ ~p

(4) ((p ⇒ q) ∧ ((p ∧ q) ⇒ r)) ⊃ (p ⇒ r)

O> uar) > =”) 

6) (>) ^ (p =r) > =g ^r) 

(7) (b> 9) v (2 >) > = @v ») 

8) =) 44>?) >(@>7 >@ +7) 
9) ~ == (p= ~a) 

(10) p ⇒ (q ∨ ~q).

To demonstrate that any of these propositions is a tautology it is sufficient
to show that given that the variables range over semi-decidable propositions,
the proposition can coherently be accepted as empirically true, but cannot
coherently be accepted as empirically false. To illustrate the method that
is here being employed, let us derive (3) and (4).

Derivation of (3): Since T(~p) is coherent and any coherent belief
system that contains T(~p) must also contain T(q ⊃ ~p), we have that
(3) can rationally be accepted as empirically true. It remains to be shown
that (3) cannot rationally be accepted as empirically false. This is
demonstrated by the fact that in the following tableau all paths are closed:


F((~q ∧ (p ⇒ q)) ⊃ ~p)


T(~q ∧ (p ⇒ q))
F(~p)


Derivation of (4): To demonstrate that (4) can be accepted as empirically
true, it is sufficient to show that (4)’ can be accepted as empirically true,
where (4)’ is obtained from (4) by any uniform substitution of propositional
variables or their negations for the propositional variables in (4), assuming
at the same time that the propositional variables designate logically
independent propositions. If the consequent of (4)’ is of the form p ⇒ r,
p ⇒ ~r, p ⇒ q, p ⇒ ~q, then by the assumptions of logical independence
and semi-decidability the consequent can be accepted as empirically true,
and hence (4)’ can be so accepted. If the consequent is of the form p ⇒ p,
then by semi-decidability (4)’ must be accepted as empirically true. If the
consequent is of the form p ⇒ ~p, then there are the following cases to
consider:

(a) ((p ⇒ q) ∧ ((p ∧ q) ⇒ ~p)) ⊃ (p ⇒ ~p)

(b) ((p ⇒ p) ∧ ((p ∧ p) ⇒ ~p)) ⊃ (p ⇒ ~p)

(c) ((p ⇒ ~p) ∧ ((p ∧ ~p) ⇒ ~p)) ⊃ (p ⇒ ~p)

In case (a), ((p ∧ q) ⇒ ~p) must be accepted as empirically false.
In case (b), ((p ∧ p) ⇒ ~p) must be accepted as empirically false, and
in case (c), (p ⇒ ~p) must be accepted as empirically false. Hence in
every case, (4)’ can be accepted as empirically true, and hence (4) can also be.

It is not difficult to demonstrate that (4) cannot coherently be accepted
as empirically false. From the tableaux (8), (9) and (11) we obtain:


(14) T(α ⇒ β) ── N(~α), T(β), T[α]α
              ── F(α), T[α]β
              ── X(α), X(β), T[α]β, T[~β]~α

(15) F(α ⊃ β) ── T(α), F(β)

and from (8) and (12) we get

(16) F(α ⇒ β) ── N(~α), F(β), T[α]α
              ── F(α), F[α]β
              ── X(α), X(β), F[α]β, F[β]α

Using these rules and the rule T[α]β, T[α ∧ β]γ → T[α]γ it can be shown
by tableaux analysis that every branch of the tree for F(((p ⇒ q) ∧
((p ∧ q) ⇒ r)) ⊃ (p ⇒ r)) closes, and hence that (4) cannot coherently
be accepted as empirically false. The full analysis is too cumbersome to
reproduce here, but those who are familiar with this method should not
have much difficulty in proving this. The proposition (4) is therefore
tautologous in the required sense.

The proof of (1) is obvious. That of (2) depends upon the following
derived tableau:

(17) F(α ≡ β) ── F(α), T(β)
              ── T(α), F(β)
              ── X(α), X(β), T[~α]β, T[~β]α, F[α]β, F[β]α.

It is really only necessary, however, to consider the first two branches.
For if it can be shown that these both close, then it follows that there
cannot be a coherent belief system that includes T[~α]β or T[~β]α,
and hence the third branch must also be considered to be closed. The
proof that the first two branches close is elementary. (5) follows almost
immediately using the inference rule T(α ⇒ β) → T[α]β. Of the remainder,
(7) is the most difficult to establish. I have argued meta-logically that
p ⇒ (q ∨ r) is the propositional disjunction of p ⇒ q and p ⇒ r. But to
complete the proof of tautologousness we need the rule:

T[α ∨ β]γ → T[α]γ
          → T[β]γ

In words, this is the claim that there is a coherent belief system in which
T(α ∨ β) and T(γ) occur that is nearer to a given coherent belief system B
than any in which T(α ∨ β) and either X(γ) or F(γ) occurs only if there
is a coherent belief system in which either T(α) or T(β), and T(γ) occur
that is nearer to B than any in which T(α) or T(β), as the case may be,
occurs, but T(γ) does not. This is certainly a plausible thesis, but its
acceptance implies that when we are considering the case of a hypothetical
belief system in which T(α ∨ β) occurs we cannot be envisaging a
belief system in which X(α), X(β), T[~α]β, T[~β]α occurs.

It should be pointed out that the theorems of PC* are valid only under
a substitution rule which allows the uniform substitution of propositional
variables or their negations for propositional variables. Under a general
substitution rule of wffs. of PC* for propositional variables there would be
no theorems at all. For such wffs. as (p ∧ ~p) ⇒ q are analytically
undecidable, i.e. a statement of this form cannot coherently be accepted as
empirically true or false. Hence under such a substitution rule even
‘p ∨ ~p’ would not be tautologous. For the proposition ((p ∧ ~p) ⇒ q)
∨ ~((p ∧ ~p) ⇒ q), which is of the form ‘α ∨ ~α’, is one that cannot
coherently be accepted as empirically true or false.

About the corresponding probability system not much can be said,
except that if ‘α’ is a theorem of PC*, then ‘P(α) = 1’ must be a theorem of
PRC*. We have seen that this system cannot be derived as I had originally
hoped by means of the Dutch book argument, and I cannot at present see 
any other way of deriving such a system. But until such a system has been 
derived, it is clear that we do not have an adequate logic of subjective 
probability claims. 


7 CONCLUSION AND OUTLOOK 


The main problem with which I have been concerned in this paper
remains unsolved. We have as yet no adequate logic of subjective
probability. The classical probability calculus is not an adequate logic of
subjective probability because

(a) it is not capable of handling subjective probability claims concerning
subjunctive conditionals, and

(b) it is not strong enough to deal with compound conditionals.

A simple way of dealing with the first of these problems might be to use
the theory of infinitesimals, and to make the assumption that no contingent
proposition can be assigned a probability of 1 or 0. The claim that p is
contingently true might then be represented by ‘P(p) = 1 − ε’ where ε
is an infinitesimal. But this would do nothing to solve the problem of
compound conditionals.

Several attempts have been made to extend the probability calculus
so that it can deal with compound conditionals, the best known of which
is R. C. Stalnaker’s [1970]. In an unpublished manuscript entitled
‘Epistemological Foundations of Logic’ I have myself made several
attempts to construct such a calculus. But all of these, to my knowledge,
fail because they contain the theorem that whenever P(p ∧ q) ≠ 0,
P(p ⇒ (q ⇒ r)) = P((p ∧ q) ⇒ r), and as David Lewis has demonstrated,2
any extended probability calculus that contains this theorem together with
P(p) ≥ 0, P(p ∨ ~p) = 1, P(p) = P(p ∧ q) + P(p ∧ ~q), P(p ∧ q) =
P(p) × P(p ⇒ q), and allows the substitution of tautological equivalents of PC
is at most four-valued.


1 Stalnaker [1970]. 2 In a private communication dated June 2nd, 1972.

Nevertheless, I think I have made some progress towards a solution to
the main problem, first by demonstrating that certain approaches to it
cannot succeed, and secondly by showing how the subjectivists’ definition
of validity might be used to determine the class of analytically certain
propositions, and hence the set of theorems of PRC* of the form ‘P(α) = 1’.
It has been demonstrated that rational fair-betting quotients for semi- 
decidable propositions are not probability measures. And since, in most 
cases, it is clear that conditional propositions are semi-decidable, it 
follows that the Dutch book argument, based on the concept of strict 
coherence, cannot be used to derive the required augmented probability 
calculus. However, it does not follow from this that the subjectivists’ 
concept of validity cannot be used to determine the class of analytically 
certain propositions. On the contrary, it turns out that the various proposi- 
tional connectives and operators including conditionalisation can be 
defined in terms of acceptability conditions, and that when they are so 
defined an augmented propositional calculus that includes PC is readily 
derivable. And, by the logical correspondence principle, the theorems of 
this calculus must all be propositional formulae, all instances of which are 
analytically certain propositions. 

The theory of conditionals offered in this paper obviously owes much to 
the work of Stalnaker and Lewis.1 But unlike them, I have no time for 
possible-worlds analyses. I do not believe, as Lewis does, that there 
exists a set of possible worlds, of which the actual world is one, and which 
is distinguished from other possible worlds only by the fact that it is we 
rather than our counterparts who inhabit it. I do not have to swallow this 
enormous metaphysical pill to get my system going. Nor do I incur any 
of the problems of cross-possible world identity which both Stalnaker and 
Lewis have to grapple with. All I need is different possible belief-systems 
about the actual world. And since we have at least 3,000 million of them, 
I see no problem about considering the possibility of others. I do need an 
epistemological ‘nearness’ relationship, much as Stalnaker and Lewis need 
a similarity relationship on possible worlds. But while we all have a great 
deal of experience in comparing belief systems we have no such experience 
in comparing other possible worlds with our own, and in the end, it is 
the epistemological relationship of nearness which must be used to deter- 
mine the supposed ontological relationship of similarity. 

It has been demonstrated that if the propositional connectives and 
operators are defined in terms of acceptability conditions, a calculus PC* 
which includes PC can be derived. And while there is much work yet to 
be done on acceptability conditions and acceptability-analysis, it already 

1 Stalnaker [1968], Stalnaker and Thomason [1970] and Lewis [1973].

Discussions


MUST THE LOGICAL PROBABILITY OF LAWS BE ZERO? *


It has been known for some time that it is possible to define a real-valued
function $\mu$ on the Lindenbaum (sentence) algebra of a countable first order
language L such that
(i) $\mu$ is non-negative

(ii) $\mu([a \vee \neg a]) = 1$

(iii) if $\vdash \neg(a \wedge b)$ then $\mu([a \vee b]) = \mu([a]) + \mu([b])$, and

(iv) $\mu$ is strictly positive, i.e. if $\mu([a]) = 0$ then $\vdash a \rightarrow (b \wedge \neg b)$
where a and b are any sentences of L. This remains possible, moreover, when
the axioms of L are augmented with an ‘axiom of infinity’ 1 (so that the notion
of contradiction is extended to include sentences true in domains of at most n
elements).2 Now any function satisfying (i), (ii) and (iii) is formally a
probability-measure, and probability-measures which in addition satisfy (iv) have
actually been used in investigations of probabilistic inductive logic, notably by
Hintikka. Yet in Appendix *vii of his [1934], Popper claims to have proved (in
more than one way) that where p(x) denotes the logical probability of some
sentence x,

$$p(a) = 0 \qquad (1)$$

where a is a contingent universal sentence (interpreted in an at least denumerable
domain); from which follows that if p(b) > 0, then p(a/b) = 0. Can the
demonstration of the existence of measures which do not satisfy (1) be taken to be a
counter-example to Popper’s thesis?

I contend that they can. The latter part of this paper will argue directly for 
this claim, while in the former I shall attempt to rebut Popper’s arguments for 
(1). The virtue of this strategy is obvious enough; for it is clear that by imposing 
more stringent conditions one decreases the admissible measures on a given 
algebra, and it may be that Popper’s arguments disclose a class of acceptably 


‘logical’ principles relative to which the strictly positive measures, or any not
satisfying (1), are inadmissible as measures of logical probability.


* I am grateful to Donald Gillies for reading and criticising a previous version of this
paper.

1 The simplest of which is represented by the sequence
$\exists x_1 \ldots \exists x_n \bigwedge_{i<j\le n} x_i \neq x_j$; n = 1, 2, ....

2 See, for example, Horn and Tarski [1948], Theorem 2.5. The Lindenbaum algebra of a
first order theory in a countable language is a separable Boolean algebra in the sense of
this theorem which demonstrates the existence of a strictly positive measure on any
such algebra.

3 One need go no further afield, in fact, than Carnap’s continuum (Carnap [1952]). It is
well-known that for $\lambda > 0$ the measures $m_\lambda$ of his continuum of inductive methods
assign zero measure to sentences $\forall x Fx$ where F has the logical width w and w < K (the
number of Q-predicates in a monadic language with finitely many primitive predicates),
in the limit as the number n of individuals increases without bound. But $m_0(\forall x Fx)
= w/K$ and so is independent of n; so $\lambda = 0$ also furnishes a possible counterexample to
(1), albeit a rather odd one: $m_0(\exists x Fx) = w/K$ as well, and, for example,
$m_0(Q_i a_1 \wedge Q_j a_2) = 0$. Carnap himself rejects $m_0$ because it is not regular
(ibid., p. 42) though he does not entirely rule out c-functions disobeying (1).
(Cf. Carnap [1963], p. 977.)

My claim, which I shall try to substantiate in the first part of the paper, is
that none of Popper’s ‘proofs’ actually succeeds in proving (1). However, merely
to point out fallacies is not very profitable, and a demonstration of formal
invalidity may very well point to nothing more serious than carelessness in
proof-construction and the attendant possibility that formal validity can be
restored by the addition of premisses which, for the case in question, do
incorporate such acceptably logical principles.1 It will, I trust, become clear
that Popper’s arguments are not of this character, i.e. they are not so reparable;
they are radically unsound. It must also be admitted that Popper’s arguments
for (1) are informal, and that he nowhere in the volume cited gives a precise 
characterisation of ‘logical probability’. Nevertheless, despite their informality, 
his arguments are most certainly criticisable in the way indicated, and it is 
not actually necessary to define the term ‘logical probability’ to get to grips with 
them, for the exhibition of their invalidity does not depend on a precise charac- 
terisation. 

Popper’s proofs of (1) are essentially three in number. I shall call them (a) 
the argument from independence; (b) the argument against Jeffreys; and (c) the 
‘dimensional’ argument. 


(a) The argument from independence. This is introduced by a long discussion of
the so-called ‘classical’ definition of probability.2 A number of arguments for
(1) based on this definition are put forward, but after almost four pages occurs
the remark ‘So far our considerations were based on the classical definition of
probability. But we arrive at the same result if instead we adopt as our basis
the logical interpretation of the formal calculus of probability.’ 3 But if these
interpretations are distinct, and (1), we are told, is a thesis about the logical
interpretation, then the reader is entitled to ask why so much space is expended
upon arguments which are confessedly irrelevant.

1 In such cases one can bring to light the hidden premises which are latent in the proof,
and by incorporating them as antecedents into the original theorem (or, rather, ‘naive
conjecture’), one can arrive at an improved, ‘proof-generated’ theorem. (For a discussion
of this heuristic pattern cp. Lakatos [1963-4].) Alas, Popper’s proofs have no such
hidden depths.
2 Occurring first, as a definition, in the Essai philosophique sur les probabilités of Laplace
([1812]), which forms the introduction to his Théorie Analytique des Probabilités. It is,
however, a consequence of the principle of indifference.
3 Popper [1959], p. 367.

The answer lies in the fact that there is an intimate connection between one
application of the ‘classical’ definition and (probabilistic) independence. As this
is quite well-known I shall not dwell on it except to point out that if we interpret
a universal hypothesis $\forall x Fx$ in a domain of power n, which is simply-ordered
according to some method, then we might characterise possible worlds as
n-termed conjunctions $u_i = u_{i1} \wedge u_{i2} \wedge \ldots \wedge u_{in}$ where $u_{ij} = Fe_j$ or $\bar{F}e_j$,
$1 \le i \le 2^n$, $1 \le j \le n$. If we regard all these as equally possible, then according
to Laplace’s definition each is assigned the measure $2^{-n}$. It follows that the
measure assigned to any disjunction E of the $u_i$ is just that which E would receive
if the elements $u_{ij}$ of the $u_i$ were independent with probability ½. For if
$p(u_{ij}) = \tfrac{1}{2}$ and the $u_{ij}$ are independent then $p(u_i) = 2^{-n}$. Conversely, if for all
$1 \le i \le 2^n$, $p(u_i) = 2^{-n}$, then for each j,
$p(Fe_j) = \sum_{Fe_j \text{ in } u_i} p(u_i) = 2^{n-1} \cdot 2^{-n} = \tfrac{1}{2} = p(\bar{F}e_j)$, and
the elements $u_{ij}$ are independent, for $p(u_i) = 2^{-n} = \prod_{j=1}^{n} p(u_{ij})$. (As Popper
mentions, another application of the ‘classical’ definition fails to give
independence. If the equipossible worlds are those corresponding only to the possible
proportions of Fs, viz. 0, 1/n, ..., 1 then the measure on the $u_i$ is not a product
measure. Carnap’s m* corresponds to the latter, and his m† to the former case.)
According to Popper (p. 365), one cannot simply apply the classical definition
in a proof of (1), for it assumes, as we saw, that $F$ and $\bar{F}$ are equally probable,
for which there is no clear justification. But what, apparently, is ‘essentially
correct’ in it is the implied independence of the $u_{ij}$ (cf. p. 365), and it is this
contention that we shall scrutinise.
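
The connection between the equipossible assignment and independence can be checked by brute force on a small domain. The sketch below is only an illustration: the domain size `n = 4` and the helper names `classical_measure` and `prob` are choices made here, not anything in Popper's or the present text. A state description is encoded as a tuple of booleans, entry j being True when the j-th individual is F.

```python
from fractions import Fraction
from itertools import product


def classical_measure(n):
    """Laplace's assignment: each of the 2**n state descriptions gets 2**-n."""
    weight = Fraction(1, 2 ** n)
    return {world: weight for world in product([True, False], repeat=n)}


def prob(measure, event):
    """Total measure of the state descriptions that satisfy `event`."""
    return sum((p for world, p in measure.items() if event(world)), Fraction(0))


n = 4  # illustrative domain size (an assumption)
m = classical_measure(n)

# p(Fe_j) for each individual j: the sum over state descriptions containing Fe_j.
marginals = [prob(m, lambda w, j=j: w[j]) for j in range(n)]

# Joint probability that the first two individuals are both F.
joint_first_two = prob(m, lambda w: w[0] and w[1])

# Measure of the universal hypothesis: every individual is F.
universal = prob(m, all)
```

Each marginal comes out as 1/2, the joint value equals the product of the two marginals, and the universal hypothesis receives 2⁻ⁿ, which is the product of all n marginals.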


Before leaving the ‘classical’ definition, let us look briefly at Popper’s
account of statistical hypotheses relative to a ‘classical’ distribution of
probabilities over them. Now Popper points out that the universal sentence
‘All A is B’ entails a weaker statistical hypothesis P(B/A) = 1. (I shall
use a capital P to denote the physical probability measure and for the
sake of argument accept Popper’s propensity characterisation of it; A and
B will denote events.) Now for Popper,1 a statistical statement P(B/A) =
r ± δ is assigned ‘with the help of Laplace’s distribution’ 2 the logical
probability 2δ (clearly ½ is an upper bound on δ). Hypotheses of the form
P(B/A) = 1, therefore, have zero probability and consequently so do all
universal hypotheses ‘All A is B’. Obviously the premiss of a uniform
distribution may be denied without contradiction or, as far as Popper’s
argument for it goes, without unnaturalness. (Popper actually does argue for
it,3 invoking the principle that ‘the informative content [and hence, for
Popper, the improbability] of a statement increases with its precision’
(which is a function of δ). But it does not follow from this that the
relationship must be of the form y = mx, where 0 ≤ x ≤ 1/m.) But consider the
following argument. Suppose that we wish to assign a positive measure
to P(B/A) = r; then we can do this consistently only for at most countably
many values of r (for there are at most n disjoint events with probabilities
> 1/n). Therefore, since we cannot plausibly select only countably many
preferred values by any logical principle, we should put all these hypotheses
on the same footing of zero probability (and presumably would do well
to adopt the uniform distribution p(P(B/A) ≤ r) = r). But these
conclusions are obtained only with the aid of what Keynes termed the principle
of indifference, namely ‘if there is no reason to suppose that events (or
propositions) differ in probability, infer that they don’t’. Taken generally,
this principle is notoriously inconsistent,4 though more to the point it has
no obvious credentials to be a foundation of a theory of logical probability.
(One might argue in addition that we cannot, by a simple application of
Cantor’s diagonal argument, name in any natural language all the real
numbers in any non-degenerate interval and consequently we can formulate
only a countable set of sentences P(B/A) = r, to each of which a positive
probability may be assigned, though not by a uniform distribution.
However, I do not think that this is the fundamental objection to the
argument.)

1 Ibid., p. 411.
2 Though in fact the assumption of a uniform prior distribution over the closed unit
interval was due first to Bayes, and frequently goes under the name of Bayes’s Postulate.
3 Ibid., pp. 410-77.
4 Keynes [1921], pp. 41-52.
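
The countability step in the argument just considered (that a positive measure can be given to P(B/A) = r for at most countably many r) can be spelled out as a short derivation. The sets $R_n$ and $S$ are notation introduced here for illustration only.

```latex
% R_n collects the values of r whose hypothesis receives measure above 1/n.
R_n = \{\, r \in [0,1] : p\bigl(P(B/A) = r\bigr) > 1/n \,\}, \qquad n = 1, 2, \ldots

% Hypotheses with distinct r are mutually exclusive, so for any finite S \subseteq R_n
\frac{|S|}{n} \;\le\; \sum_{r \in S} p\bigl(P(B/A) = r\bigr) \;\le\; 1
\quad\Longrightarrow\quad |S| \le n ,
% and therefore R_n itself has at most n elements.

% Every r with positive measure lies in some R_n, hence
\{\, r : p\bigl(P(B/A) = r\bigr) > 0 \,\} \;=\; \bigcup_{n \ge 1} R_n
% is a countable union of finite sets, and so is at most countable.
```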

Let us now look at the arguments from independence proper. Popper’s
argument here consists of two parts. One is an entirely trivial demonstration assuming
independence, while the second (which is the one I want to criticise) attempts to
show that the $u_{ij}$ must be independent. Now clearly, for any non-extreme
assignments to the elements $u_{ij}$ which may depend on the index j of the element in
the logical product, independence is a sufficient condition for $p(\forall x Fx) = 0$ in
the limit as the size of the universe tends to infinity. This is the argument
actually from independence and is indisputable. Why should the $u_{ij}$ be
independent? First, ‘every other assumption [than independence] would amount to
postulating ad hoc a kind of after-effect’,1 or else it would have ‘the character
of a synthetic a priori principle of induction, rather than of a ... logical
assertion’.2 And such a principle ‘should be formulated as a hypothesis. It ... cannot
form part of a purely logical theory of probability’.3 The assumptions that serve
to link, for Popper, formulas of the formal calculus of probability with ‘synthetic
a priori’ statements or ‘inductive principles’ are nowhere evident and I do not
at this point wish to emphasise them, though I think that this covert identification
of logical probabilities with quantities of methodological significance deserves a
very careful scrutiny indeed, and I shall return briefly to this point at the end of
this paper. If it is the case that Popper is only accepting here what is taken for
granted by ‘my inductivist opponents’ for the purpose of exhibiting
inconsistencies in their programme, then in the first place this is nowhere indicated in
the text, and in the second, as will be seen, allowing for this possibility does
not mitigate inconsistency in Popper’s own position. Let us grant these
assumptions, whatever they might be, and continue with Popper’s argument.
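
The argument ‘actually from independence’ mentioned above can be written out as a one-line bound. The sketch assumes, beyond what the text states, that ‘non-extreme’ means a uniform bound: $p(Fe_j) \le 1 - \delta$ for every j, with a fixed $\delta > 0$ introduced here for illustration.

```latex
% Assumption (not in the text): p(Fe_j) \le 1 - \delta for all j, with \delta > 0 fixed.
p\Bigl(\bigwedge_{j=1}^{n} Fe_j\Bigr)
  \;=\; \prod_{j=1}^{n} p(Fe_j)
  \;\le\; (1-\delta)^{n}
  \;\longrightarrow\; 0 \quad (n \to \infty),
\qquad\text{so that}\qquad p(\forall x\, Fx) = 0 .
```

The uniform bound matters for this sketch: if the assignments were allowed to approach 1 quickly, for example $p(Fe_j) = 1 - 1/(j+1)^2$, the product would converge to 1/2 rather than to 0.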

From the points made in the preceding quotations, Popper concludes that 
‘... the only reasonable assumption seems to be that... we must consider [the 
elements u;,] as mutually independent of one another’, and again, ‘.. . if we are 
concerned with absolute logical probabilities then p(a,a,) = paola). 5 (Inci- 
dentally, Popper’s equations entail only that the elements are pairwise indepen- 
dent: it does not follow that they are independent in triples, n-tuples etc.®) 
Most of the remaining material in this part of Popper’s appendix is merely 
elaboration of this argument, and Popper transfers to this context the remark of 
David Hume, that ‘Adam could not so much as prove by any probable arguments 
that the future must be conformable to the past. All probable arguments are 
built on the supposition that there is conformity between the future and the past, 
and can never therefore prove it’? This argument is a succinct version of 
Popper’s (with the difference only that what Hume calls ‘probability’ Popper 
calls ‘logical probability’®: it is tantamount to the assertion that what has come 
to be called the principle of positive instantial relevance is not a theorem of 
1 Popper [1959], p.-367. a Ibid., p. 370. > Ibid., 

* Ibid., p. 367. 5 Ibid., p. 368. e Cf. TANAR [1946], p. 11. 

«7 Quoted in Popper, op. cit., p. 369. 

8 At least, one assumes that this must be the case: if not, it is difficult to see the relevance 
of Popper’s quoting Hume at all. 


Must the Logical Probability of Laws be Zero? 157 


logic, in this—vague—interpretation of the probability function), except that 
Hume did not conclude that the only measure on the u, must be a product 
measure. But it is quite clear that by Popper’s own reasoning the assump- 
tion of independence must suffer the same fate as that of dependence: 
it is equally incongruous in this ‘purely logical theory of probability’; for it 
postulates equally ad hoc a lack of after-effect, and it is equally paraphrasable 
by a ‘synthetic a priori principle.’ It also does not have ‘the character of a 
tautology, valid in every possible universe’. Thus Popper’s conclusion is incon- 
sistent with his premisses. But what is remarkable about his reasoning is that 
it seems to prove that there is no such thing as logical probability, for any measure 
will create either dependence or independence among the u,,. I think that this 
conclusion is indeed correct for any theory of logical probability which both 
intends all its consequences to be logico-mathematical truths and which permits 
Popper’s construction of the term ‘logical probability’ (this is rather vague, but 
let us merely endow it with the properties Popper himself attributes to it, such 
as they are). But this conclusion does not apply (or rather it would have to be
shown to apply by arguments different from Popper’s) to a theory of logical 
probability which takes as its point of departure the currently accepted theory of 
logical consequence that goes back to Bolzano, and seeks to generalise this 
relation to one of partial entailment. Such a theory would be logically indepen- 
dent of ‘synthetic a priori principles’ (except insofar as contemporary deductive 
logic relies on some system of set theory), and none of Popper’s arguments in
this section impinges on it.¹

Popper’s insistence, it can further be said, on the independence of the $u_i$,
combined with his Humean construction of the term ‘logical probability’ may 
well add up to a position of agnosticism in the face of sample evidence, but it 
does so at the cost of almost total credulity about some aspects of the future, in 
the absence of all evidence whatever. For it is a consequence of a very general 
theorem known as the central limit theorem that the relative frequency of 
successes in a sequence of outcomes which are simple alternatives converges in 


probability to the quantity $p(n) = \frac{1}{n}\sum_{i=1}^{n} p_i$, where $p_i$ is the probability of success
at the i-th outcome, given only that the outcomes are independent. Thus the
probability of the sentence ‘the relative frequency of heads in n tosses of any
arbitrary coins is confined within the interval $(p(n)-\epsilon,\ p(n)+\epsilon)$’, where $\epsilon$ is an
arbitrarily small positive constant, can be made as close to 1 as one likes by
taking n sufficiently large. But if one is permitted to use a uniform prior distribution
over the physical probabilities of one given coin (‘Laplace’s distribution’
again), then for small $\epsilon$ and large n it is almost ‘almost certain’ that the
actual long-run relative frequency of heads on that coin will lie outside such an
interval. Whether one is disposed to regard this seriously or not, the fact remains
1 At least from the time of the Cartesian Port Royal Logic (1662), until the present century, 
it was customary to regard logic as, roughly, the theory of rational belief, comprising a 
mixture of psychology, syllogistic and what we now distinguish as methodology. Within 
this theory an epistemic interpretation of probability was developed, and became, before 
the work of Wittgenstein, in this interpretation the foundation of the early accounts of 
probabilistic inductive logic to be published this century. The doctrine is set out at its
clearest in the Introduction to W. E. Johnson’s Logic. Popper’s treatment, with its
apparently other-than-logical account of logical probability, seems to hearken back to
this tradition.

that independence (combined with long sequences of trials) is a standard condition
to impose in order to approximate the extreme probability-values; and
where the probability in question is logical, and construed in a manner indicative
of the extent to which it is rational to assent to some proposition, it ushers in a
rather extreme degree of a priorism.
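
The contrast just drawn can be checked exactly for a finite number of tosses. The following is a minimal sketch under stated assumptions: the function names, the choice of fair independent tosses ($p_i = 1/2$ for every i) and the values of n and $\epsilon$ are illustrative only, and the comparison concerns the relative frequency in n tosses rather than the long-run frequency itself. It relies on the standard fact that, under a uniform prior on a coin's bias, each head-count from 0 to n has probability 1/(n+1).

```python
from fractions import Fraction
from math import comb


def prob_within_independent(n, eps, p=Fraction(1, 2)):
    """P(|k/n - p| < eps) for n independent tosses, each with success probability p."""
    total = Fraction(0)
    for k in range(n + 1):
        if abs(Fraction(k, n) - p) < eps:
            total += comb(n, k) * p**k * (1 - p) ** (n - k)
    return total


def prob_within_uniform_prior(n, eps, centre=Fraction(1, 2)):
    """The same event when the bias of the coin has a uniform prior.

    Under that prior every head-count k in 0..n has probability 1/(n+1).
    """
    hits = sum(1 for k in range(n + 1) if abs(Fraction(k, n) - centre) < eps)
    return Fraction(hits, n + 1)


if __name__ == "__main__":
    n, eps = 200, Fraction(1, 10)
    print(float(prob_within_independent(n, eps)))
    print(float(prob_within_uniform_prior(n, eps)))
```

With independence assumed a priori the interval captures nearly all the probability; with the uniform prior it captures only about $2\epsilon$ of it, however large n is taken.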

(b) The argument against Jeffreys. Actually Popper has more than one argument
against Jeffreys’s theory of probability; this one however aims directly at establishing
(1) by a purely technical argument, and takes as its point of departure
some remarks of Sir Harold Jeffreys in his [1939].¹ First we need some notation.
Following Popper, we can suppose a to be a theory entailing denumerably many
instantiations $b_1, b_2, \ldots, b_n, \ldots$

Let
$$b^n = b_1 \wedge b_2 \wedge \ldots \wedge b_n.$$
Then
$$p(b_i/a) = 1 \text{ for all } i,$$
and
$$p(b^n/a) = 1 \text{ for all } n.$$
Consequently
$$p(a/b^n) = \frac{p(a)}{p(b^n)} \qquad (2)$$

Now $p(b^n) = \prod_{i=1}^{n} p(b_i/b^{i-1})$, where $b^0$ is a given tautology.

Now $p(b^{n+1}) \le p(b^n)$, and since $0 \le p(b^n) \le 1$ it follows that $\lim_{n\to\infty} p(b^n)$ exists,
and hence that $\lim_{n\to\infty} \prod_{i=1}^{n} p(b_i/b^{i-1})$ exists. It follows that $\lim_{n\to\infty} p(b_n/b^{n-1}) = 1$, or
$p(b^{n'}) = 0$ for some $n'$, and hence for all $n > n'$. But if $p(a) > 0$ and $p(b^n) = 0$
for some n, then $p(a/b^i) = \infty$ for $i > n$, which is impossible. Consequently
we may infer that either $p(a) = 0$ or $p(b_n/b^{n-1}) \to 1$. Jeffreys plumped for the
second alternative. Popper contends that in so doing he was wrong, and that
the thesis $p(a) = 0$ is denied on pain of contradiction.² These reasons of Popper’s
I shall paraphrase as follows. A sufficient condition for the equality (2) is that
$\vdash a \to b_i$ for all $i \le n$. The assumption that $p(b_n/b^{n-1}) \to 1$ universally, Popper
claims, leads us into a contradiction. For we may construct, for any law-statement
a, a Goodmanesque analogue $a_n$ (for each n) which asserts that the first n−1 individuals
(relative to some ordering) obey the law a, while all subsequent individuals
exemplify a property disjoint from that asserted in a. Then, says Popper, if we
hold that $p(b_n/b^{n-1}) \to 1$ generally, we must ‘obtain $p(b_n/b^{n-1})$ close to 1, and
also (from another law a′ [my $a_n$]) $p(\bar{b}_n/b^{n-1})$ close to 1. Accordingly Jeffreys’s
argument [that $p(a) = 0$ or $\lim_{n\to\infty} p(b_n/b^{n-1}) = 1$], which is mathematically
inescapable, can be used to prove [that $p(a) = 0$]’.
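
Jeffreys’s dichotomy can be watched at work in a toy prior. The sketch below is an illustration under assumptions that are not in the text: the prior is a mixture in which, with probability `p_a`, the law a holds (so every $b_i$ is true) and otherwise the $b_i$ behave like independent fair tosses; the function names are hypothetical. With `p_a` positive the probability of the next instance tends to 1 and the posterior of the law follows equation (2); with `p_a` zero the next instance stays at 1/2.

```python
from fractions import Fraction


def p_conj(n, p_a):
    """p(b^n) under the assumed mixture prior (b^0 is a tautology, so p_conj(0) = 1)."""
    return p_a + (1 - p_a) * Fraction(1, 2**n)


def posterior_of_law(n, p_a):
    """p(a / b^n) = p(a) / p(b^n), i.e. equation (2)."""
    return p_a / p_conj(n, p_a)


def next_instance(n, p_a):
    """p(b_n / b^(n-1)) = p(b^n) / p(b^(n-1)), for n >= 1."""
    return p_conj(n, p_a) / p_conj(n - 1, p_a)


if __name__ == "__main__":
    for n in (1, 5, 10, 20):
        print(n,
              float(next_instance(n, Fraction(1, 10))),
              float(posterior_of_law(n, Fraction(1, 10))),
              float(next_instance(n, Fraction(0))))
```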


1 Jeffreys [1939], section 1.6.

2 Note that Popper’s argument does not demonstrate that $p(a) = 0$ for all a. It
is an implausibility argument against setting $p(a) > 0$ that depends for its plausibility
upon our conceding that the general probabilistic properties of some a and some sequence
$b_i$ should not depend on the particular a and sequence $b_i$. While this is questionable,
I shall concede it and show that his proposed reductio still does not follow.

But this argument of Popper’s does not hold water at all. Let us reconsider
his argument with a slightly improved notation. Define

$$c_{i,k} = b_i \quad \text{for } 0 < i < k$$
$$c_{i,k} = \bar{b}_i \quad \text{for } k \le i$$

Further, let us assume that it is possible to construct a sequence of law-like
statements $a_k$ such that $\forall i\ \vdash a_k \to c_{i,k}$, and $a_\infty = a$, and let $p(a_k) > 0$ for every k.
It follows by the considerations above that $\lim_{n} p(c_{n,k}/c^{n-1,k}) = 1$.¹ Let $f_n(k) = p(c_{n,k}/c^{n-1,k})$.
Clearly we have a situation exactly as Popper supposes, involving
a law a and its Goodmanesque analogues for all indices $k = 1, 2, \ldots, n, \ldots$.
We are now able to supply the crucial premiss that Popper’s reductio lacks. It is
that the sequence of functions $f_n(k)$ converges uniformly to 1 over the set $k > 0$.
Then we should have $\exists N\, \forall n > N\, \forall k\ \ p(c_{n,k}/c^{n-1,k}) > \tfrac{1}{2}$, and then we can simultaneously
choose $n_0 > N$ such that $p(b_{n_0}/b^{n_0-1}) > \tfrac{1}{2}$ and $k = n_0$, so that
$p(c_{n_0,n_0}/c^{n_0-1,n_0}) > \tfrac{1}{2}$, i.e. $p(\bar{b}_{n_0}/b^{n_0-1}) > \tfrac{1}{2}$, which is impossible. Therefore we can
infer that $p(a_k) > 0$ entails that the convergence of the $f_n(k)$ is not uniform,
nothing more. In particular, Popper has not shown that ‘there will always be 
predicates A and B which both apply to all things so far observed, but lead to 
incompatible probabilistic predictions with respect to the next thing.’ 
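
The difference between pointwise and uniform convergence invoked here can be exhibited in a small model. The sketch is hypothetical throughout: it assumes a prior concentrated on the law a (weight 1/2) and its Goodmanesque analogues $a_k$ (weight $2^{-(k+1)}$ each), reads $a_k$ as asserting $b_i$ for $i < k$ and $\bar{b}_i$ for $i \ge k$, and uses names (`W_LAW`, `w`, `tail`, `f`) that are not in the text. Every $a_k$ then has positive probability, $f_n(k)$ reaches 1 for each fixed k once $n > k$, and yet $f_n(n)$ tends to 0, so the convergence is not uniform in k.

```python
from fractions import Fraction

W_LAW = Fraction(1, 2)  # assumed prior weight of the law a itself


def w(k):
    """Assumed prior weight of the Goodmanesque analogue a_k, k >= 1."""
    return Fraction(1, 2 ** (k + 1))


def tail(n):
    """Total prior weight of the hypotheses consistent with b_1 ... b_(n-1):
    the law a together with every a_j for j >= n."""
    return Fraction(1, 2**n) + W_LAW


def f(n, k):
    """f_n(k) = p(c_{n,k} / c^{n-1,k}) in the assumed model."""
    if n > k:
        # The data already contain the negated instance at k: only a_k survives.
        return Fraction(1)
    if n == k:
        # Data are b_1 ... b_(k-1); c_{k,k} is the negated b_k, predicted by a_k alone.
        return w(k) / tail(k)
    # n < k: c_{n,k} is b_n, predicted by a and by every a_j with j > n.
    return tail(n + 1) / tail(n)


if __name__ == "__main__":
    for n in (1, 5, 10, 20):
        print(n, float(f(n, 3)), float(f(n, n)))
```

For each n the two incompatible predictions $f_n(n)$ and $p(b_n/b^{n-1})$ sum to 1, so they are never both close to 1.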


(c) The ‘dimensional’ argument. The full statement of this argument occurs³ in a
discussion of the possibility (advanced by Jeffreys and Wrinch) of assigning
probabilities to theories in such a way that the ordering induced by these values
should coincide with their ordering by simplicity, where theory $T_i$ is simpler
than $T_j$ if $T_i$ has fewer adjustable parameters in its equations than $T_j$. Consider
the following set of polynomial theories:

$$T_0:\quad y = a_0$$
$$T_1:\quad y = a_0 + a_1 x$$
$$\vdots$$
$$T_n:\quad y = a_0 + a_1 x + \ldots + a_n x^n$$

If we stipulate that all the coefficients $a_i$ are positive then all the $T_i$ are mutually
exclusive ($-\infty < x < \infty$), and one could, one might think, assign them all
positive probabilities in such a way that $p(T_m) > p(T_{m+1})$ for $m = 1, 2, \ldots$,
and $\lim_{n\to\infty} p(T_n) = 0$ (since $\sum_i p(T_i) \le 1$). Indeed, there are infinitely many ways
of doing this. But according to Popper, such assignments ‘contradict the laws
of the calculus of probability’.⁴ This surprising assertion depends on a lemma
that purports to show that the logical (prior) probability varies in the same
direction as the theory’s ‘dimension’ (a magnitude that is equal to the number
of adjustable parameters). Obviously positive assignments to any of the $T_i$
are impossible if this is true, for we should have $\lim_{n\to\infty} \sum_{i \le n} p(T_i) = \infty$. How then is
this lemma justified? Well, we can regard points in a suitable portion of d-dimensional
space as determining the parameters of a theory whose dimension

1 $c^{n,k} = c_{1,k} \wedge c_{2,k} \wedge \ldots \wedge c_{n,k}$. 2 Op. cit., p. 371.
3 Ibid., pp. 382-3. 4 Op. cit., p. 381.

is d. Popper’s argument (which is given in detail only in the succeeding new
appendix *viii) is that each ‘cube’ of side ε in Euclidean space of d dimensions
$R^d$ represents a possibility ‘favourable’ to a theory of dimension d ‘and the d-dimensional
arrangement will represent the set of all “quasi curves” compatible
with, or favourable to, the theory.’¹ He then goes on to say that the theory
with fewer parameters ‘will also contain a smaller number of “cubes”; that is,
of favourable possibilities.’² This is not very clear, but the argument is that
any region of $R^d$ contains a larger number of these ‘cubes’ of side ε, for small ε,
than its projection in $R^k$ where k < d. Hence there are more favourable possibilities
for a theory of d dimensions than for one of k. But this is, in the first
place, an argument from the ‘classical’ theory of probability, to which Popper
conceded there was no legitimate appeal for a theory of logical probability;
but even granting the classical theory, the argument is invalid, and palpably so.³
For each point in the ‘parameter space’ $R^d$ completely specifies a theory with d
free parameters, and is no more ‘favourable’ than any other; hence if the ‘sample
space’ of equipossibilities relative to which the classical account proceeds is $R^d$
(and it is impossible to see what else it could be) then all possibilities are just
all the favourable possibilities. So the ratio of all favourable to all possibilities
is the same relative to $R^d$ as it is to $R^k$. To sum up, this appears to me to be the
weakest of Popper’s arguments, and yet it is used to demonstrate the controversial
formula⁴

$$\text{not-}[\,p(h_1) < p(h_2) \;\&\; d(h_1) > d(h_2)\,] \qquad (3)$$

whose negation, pace Popper, most certainly does not ‘contradict the laws of
the calculus of probability’.
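
That positive, strictly decreasing probabilities can be assigned to the polynomial theories without offending the calculus is easily exhibited. The sketch below uses a geometric assignment, $p(T_n) = (1-r)\,r^n$ with $0 < r < 1$; this particular family, the function name and the parameter `ratio` are illustrative assumptions and are not taken from Popper or from Jeffreys and Wrinch. Each choice of r gives a different admissible assignment, which is one way of seeing that there are infinitely many.

```python
from fractions import Fraction


def geometric_prior(n_theories, ratio=Fraction(1, 2)):
    """Return p(T_0), ..., p(T_{n_theories-1}) with p(T_n) = (1 - ratio) * ratio**n."""
    if not 0 < ratio < 1:
        raise ValueError("ratio must lie strictly between 0 and 1")
    return [(1 - ratio) * ratio**n for n in range(n_theories)]


if __name__ == "__main__":
    probs = geometric_prior(10)
    print([str(p) for p in probs])
    print("partial sum:", sum(probs))  # stays below 1 for every finite number of theories
```

The simpler theory always receives the larger value, the values tend to zero, and the partial sums never exceed 1.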


But, it may be objected, is it not the case that a theory with n free parameters 
is less specific than one with k, k < n, and hence less easily falsified?—hence its 
content is greater and its probability less. Then, on this (quite general) argument, 
the negation of (3) would after all be inconsistent, for it is a theorem of (i)-(iii)
(above) that if the content of b does not exceed that of a, i.e. if $\vdash a \to b$, then
$p([a]) \le p([b])$. An argument of this type does in fact appear as a reinforcement
of Popper’s argument in the preceding paragraph.5 But it is not successful. For 
it is quite consistent with the axioms of probability for one theory to have more 
free parameters than another, and yet to be less probable (for the sequence
$T_i$ of polynomial theories on p. 159 above can easily be constructed within a first
order language, and positive measures—as we have seen—can then be assigned 
to each. And this is sufficient to refute Popper’s (3)). It is curious that Popper 
implicitly concedes this when he admits that his theory of dimension is intended to 
provide a method of comparing theories that are incommensurable relative to an 
ordering by range-inclusion (for only in cases where all the models of a are also 
models of b do the axioms of probability pronounce); and he says only that ‘the 
strength or content of a ... can be measured by $d(a)+1$’⁶ (my italics); and this
measure does not entail, contrary to his less tentative remarks, his formula (3). 

This concludes my case that Popper’s arguments ought not to constrain us to 


1 Ibid., p. 381. 2 Ibid., p. 381.

3 Though what follows relates only to Popper’s argument. It certainly can be argued that
relative to a random distribution of observations the probability of a hypothesis varies
directly with the number of its free parameters.

4 Ibid., p. 381. 5 Ibid., pp. 381-2. 6 Ibid., p. 382.

accept (1). But still (1) might be true. I do not think so, for the following reasons
(no new results, by the way, are produced here, only some quite familiar facts). 
Let us agree to define a finite measure μ on the algebra of classes of models
(called elementary classes) of sentences of some first order language L with
equality (which is assumed countable).¹ Then for fixed B, $\mu(X \cap B)$ is also a
finite measure on the same field, and $\mu_B(X)$, defined by $\mu(X \cap B)/\mu(B)$ for
$\mu(B) > 0$, is, for every B, a normalised measure which looks like a good formal
explication of what one means, where b is the sentence corresponding to B,
when one talks of the extent to which b entails other sentences of L (for clearly
$\mu_B(X)$ is just the proportion, relative to μ, of b’s models which are also models
of the sentence x). Evidently, $p(x/y)$, defined by $\mu_Y(X)$, satisfies the usual axioms
of the probability calculus.² But the trouble with all this is that, while it may
give a good explication³ of partial entailment, based on the usual semantic
notion of consequence, it pretty well exhausts that notion, leaving one with
scant indication of how one ought to choose from among the uncountably
many measures available.
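
The construction of $\mu_B(X)$ can be made concrete in the simplest possible setting. The sketch below is an illustration only: it replaces the first order language by a propositional language with two atoms, so that the ‘models’ are the four truth-value assignments, and the names `ATOMS`, `MODELS`, `models_of`, `measure` and `partial_entailment` are hypothetical. It shows that the degree to which b ‘partially entails’ x shifts with the weights chosen, and that a measure which is not strictly positive can award the value 1 to sentences that are logically independent.

```python
from fractions import Fraction
from itertools import product

ATOMS = ("p", "q")
# The four valuations, in the order (T,T), (T,F), (F,T), (F,F).
MODELS = [dict(zip(ATOMS, vals)) for vals in product([True, False], repeat=len(ATOMS))]


def models_of(sentence):
    """Indices of the valuations in which `sentence` (a predicate on a valuation) is true."""
    return {i for i, m in enumerate(MODELS) if sentence(m)}


def measure(weights, model_set):
    """mu of a class of models, given one non-negative weight per valuation."""
    return sum((weights[i] for i in model_set), Fraction(0))


def partial_entailment(weights, x, b):
    """mu_B(X) = mu(X & B) / mu(B); undefined when mu(B) = 0."""
    X, B = models_of(x), models_of(b)
    mu_b = measure(weights, B)
    if mu_b == 0:
        raise ValueError("mu(B) = 0: the conditional measure is not defined")
    return measure(weights, X & B) / mu_b


if __name__ == "__main__":
    p = lambda m: m["p"]
    q = lambda m: m["q"]
    p_or_q = lambda m: m["p"] or m["q"]
    uniform = [Fraction(1, 4)] * 4
    skewed = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 8), Fraction(1, 8)]
    blind = [Fraction(1, 2), Fraction(0), Fraction(0), Fraction(1, 2)]
    print(partial_entailment(uniform, p, p_or_q))
    print(partial_entailment(skewed, p, p_or_q))
    print(partial_entailment(blind, q, p))
```

Nothing in the definition itself favours one list of weights over another, which is the difficulty the text goes on to press.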

Indeed, one can say quite plausibly that the theory of conditional probabilities
(at least, in the present context) really boils down to choosing the basic measure
μ and that therefore there is no way of developing a theory of logical probability
without facing this task. Clearly, where t is any tautology, and $\mu(X)$ is defined
as $p(x/t)$, we can represent the function $p(x/y)$ as $\mu(X \cap Y)/\mu(Y)$ for all values
of x and all y such that $p(y/t) > 0$⁴ (conditioning on contradictions is of course


1 Though the membership of these classes must be assumed to be limited: for example, 
the full universe of sets is a model of a finite consistent subset of Zermelo-Fraenkel set 
theory, yet is itself a proper class. 

2 Thus a may be deducible from b modulo μ (i.e. p(a/b) may be 1) even though a and b
are strictly logically independent. The measure used may not ‘notice’ the models of
b that do not satisfy a, setting the proportion of those that do at 1: b’s models are models
of a ‘almost everywhere’ (a strictly positive measure would make ‘almost everywhere’
equivalent to ‘everywhere’).

The measure μ, which normalised might be termed a prior probability (though in this
context the term is inappropriate) is continuous, by a well known argument from the
compactness of L. Hence by a standard method it can be shown to have a unique
countably additive extension on the σ-field generated by the elementary classes. Then
where Σ is an arbitrary set of sentences and M(a) the class of models of a

$$\mu M(\Sigma) = \lim_{n} \mu M(s_1 \wedge s_2 \wedge \ldots \wedge s_n); \qquad \Sigma = \{s_1, s_2, \ldots, s_n, \ldots\}$$

Thus it is possible to consider $\mu M(\Sigma)$ where Σ is an infinite set of sentences in L (it
will anyway be countable). However we know that for some (of course finite) conjunction
a of elements of Σ, if $\Sigma \vdash x$, then $a \vdash x$. It seems entirely possible, however, that there
are Σ, x, such that $\mu_{M(\Sigma)}(X) \ne \mu_{M(a)}(X)$ for all such conjunctions a.

Note that $\mu_B$ where $\mu(B) = 0$ is not defined. There seems no way round this (though
this is by no means a necessary consequence for probability spaces where there is a
family of naturally arising random variables; see Kolmogorov [1933], pp. 47-52 for the
original idea, which has now become a standard method).

3 Though in view of the recursive undecidability of a large number of (even finitely 
axiomatisable) first order theories, of limited applicability (even supposing it were 
otherwise satisfactory: that it is not is the burden of the next couple of paragraphs.) 

4 Where the class of admissible second arguments does not include valid sentences the
situation is more complicated. However, why the set of second arguments should be
discontinuous in this way is not apparent and no theory of logical probability extant
makes this stipulation. Sufficient conditions for a countably additive conditional
probability set function (for some given set of second arguments) to be representable
as a quotient of values of a measure are given in Rényi [1956].


ruled out within the standard formulations of a two-place logical probability
function, for we should have $p(a/a \wedge \bar{a}) = 1$ and $p(\bar{a}/a \wedge \bar{a}) = 1$ and hence
$p(a \vee \bar{a}/a \wedge \bar{a}) = 2$). My belief is that no restriction that warrants the title
‘logical’ can be imposed on these ‘logical’ measures, for the class of logical 
principles is to my mind naturally occupied by just those conditions which 
determine an acceptable consequence relation, and these are just those classified 
according to the usual division of labour among logical axioms and rules of 
inference (though because of the incompleteness of second and higher order 
logic under the unrestricted interpretation of the quantifiers an uncontroversial 
syntactical characterisation is possible only for first order logic), and none of these 
distinguishes any preferred class of measures satisfying (1) above. But in that 
case the only relations of partial deducibility that remain invariant under all 
choices of μ are those expressed in the axioms of the probability calculus, and
the precision afforded by the continuum to provide an ordering of sentences 
based on ‘partial deducibility’ relative to some fixed set of sentences is merely 
spurious. The utility of granting the existence of ‘logical probability’ thus seems 
highly questionable. However, one can rest assured that if we allow its existence 
then there are certainly measures which falsify (1). 

A word about historical antecedents is in order here. This theory, such as it is, 
of logical probability, as an extension of the contemporary notion of entailment 
(though deductive logic is not necessarily fully representable in it: for one thing 
if the measure is not strictly positive, then, as we saw, some fallacious inferences 
——‘fallacious’ in the traditional sense—become indiscernible from sound) 
commences with Wittgenstein, where it is worked out using a simple counting 
measure on the truth table ranges of propositional terms (the notion of logical 
content as determined by the complement of the range of a proposition apparently 
first appears here too).¹ The idea was taken up by Waismann,² and developed in
detail for a class of predicate languages by, of course, Carnap, commencing
with his [1950]. Carnap (as did Waismann) attempts to impose additional
constraints on the range measure (our μ), though at the same time
manifesting a mild disapproval towards the ‘qualified psychologism’ implicit
in such notions as ‘rational belief’³ and insisting that ‘references to something
extralogical should not obscure the nature of probability, as a purely logical
concept’. And yet Carnap informally characterises a logical probability
function as determining a system of rational betting quotients, and rejects a
whole class of logical measures as inadequate, because they don’t conform to
this requirement. Thus, we read that m† is not an ‘adequate explicatum’ of
logical probability.⁴ So presumably being the quotient of values of a range
measure (which, incidentally, according to Carnap, satisfied his criterion of
being purely logical) is at most a necessary condition for being an adequate
explicatum of ‘partial L-implication’.⁵ This may well block the looming paradox,
but the position it represents is scarcely more tenable. Thus the requirements of
regularity and symmetry for his c-functions are justified by Carnap principally
in terms of principles of rationality, not of logic (though symmetry is further
argued by an analogy drawn with deductive logic: there the property of being
derivable from logical axioms alone is invariant under permutations of the 


1 Wittgenstein [1961], p. 79. 2 Waismann [1930-1]. 3 Carnap [1950], p. 44.
4 Op. cit., p. 299. 5 Ibid., p. 297.

set of individual constants. But the reason that this is so is precisely because
such sentences remain semantically valid under these permutations; and 
semantical validity is, it goes without saying, a property peculiar to semantically 
valid sentences. So the analogy breaks down). It seems to me that there are two 
quite distinct and non-coextensive notions conflated here: one constitutes part of 
a theory of rationality, and the other a part of logic. 

These few exegetical remarks afford a convenient opportunity to draw 
together the threads of this paper, which remains primarily about Popper’s 
‘proofs’ of (1). Popper, I suspect, like Carnap, has conflated logical and epistemo- 
logical considerations and (1) is almost certainly one of the most conspicuous 
fruits of this. It seems to me that the reasonable attitude to law sentences in 
infinite domains is indeed to be extremely sceptical of their truth. The un- 
naturalness of any other policy is witnessed by, for example, the fact that to 
obtain a positive measure on such a sentence one would have in Carnapian terms 
(relative to his monadic languages) to increase the average weight of state 
descriptions satisfying it by a factor of the order of $a^n$ relative to the average
weight of all state descriptions, where a is a certain constant greater than 1 and n
is the size of the universe. This smacks of a certain arbitrariness, to be sure.
But it is not illogical. 

C. HOWSON 


London School of Economics

ON LEARNING FROM OUR MISTAKES*


1 In the Preface to the first edition of Conjectures and Refutations
written in 1962, Popper characterised his thesis concerning the theory of know- 
ledge and its growth by the slogan ‘We can learn from our mistakes’ (p.vii), and 
three years later, in the Preface to the second edition, he strengthened this by
asserting that it is part of his thesis that ‘All our knowledge grows only through 
the correcting of our mistakes’ (p. ix). According to Popper, we learn from our 
mistakes by proposing bold conjectures or guesses as solutions to our problems 
and testing them, and eventually refuting them and in this way learning where 
we were wrong. In his own words:¹


I can therefore gladly admit that falsificationists like myself much prefer an 
attempt to solve an interesting problem by a bold conjecture, even (and 
especially) if it soon turns out to be false, to any recital of a sequence of 
irrelevant truisms. We prefer this because we believe that in this way we 
learn from our mistakes; and that in finding that our conjecture was false, 
we shall have learnt much about the truth, and shall have got nearer to the 
truth. 


In this paper the view that we can learn most effectively from our mistakes by 
making bold conjectures and refuting them is challenged. It is argued that the 
Popperian conception of science and its growth leads to the conclusion that 
although we learn from bold speculative conjectures if the conjectures are 
subsequently confirmed, a refuted conjecture is more informative if it is cautious 
and relatively non-speculative rather than bold. We learn effectively from our 
mistakes by being cautious. 

When Popper supports the proposal and testing of bold conjectures in science,
one sense of the adjective ‘bold’ can be identified with ‘a high degree of falsifiability’.
Since there is another sense to Popper’s use of ‘bold’, to be described
below, I will refer to this first sense as bold₁. The extent to which a theory is
bold₁, then, depends on the size of the class of its potential falsifiers. Since this
class will be infinite for any interesting theory, it is not possible to define an
absolute measure of degree of falsifiability and thus of boldness₁. But in certain
special cases, for instance, when the class of potential falsifiers of one theory is a
subclass of the class of potential falsifiers of another, it is possible to compare
degrees of boldness₁.²

Lakatos³ has emphasised the relative nature of what is here referred to as
boldness₁. According to him, a new theory will be bold if it has ‘excess content’
or ‘excess falsifiability’ over ‘the theory it challenges’, which will usually be the 
theory generally accepted, either explicitly or implicitly, prior to the testing of the 
new theory. Lakatos is thus able to stress that ‘one cannot decide whether a 
theory is bold by examining the theory in isolation, but only by examining it in 


* While working on this paper I received financial support from the University of Sydney
in the form of a post-doctoral research fellowship.

1 Popper [1969], p. 231.

2 For Popper’s detailed account of the comparison of degrees of falsifiability see his
[1959], chapter 6. 3 Lakatos [1968], p. 375.

its historico-methodological context, against the background of its available
rivals’.¹

Boldness₂ depends on the unlikelihood of the predictions of a theory, on their
novel and unexpected character, as assessed relative to what Popper calls ‘background
knowledge’. This background knowledge is comprised of those parts of
our knowledge which are tentatively accepted, immediately prior to the testing
of a new theory, as unproblematic.² A theory will certainly be bold₂ if, in
Popper’s own words,³ it predicts ‘new kinds of events (which the physicist calls
“new effects”) such as the predictions which led to the discovery of wireless
waves, or of zero point energy or to the artificial building up of new elements not
previously found in nature’ as opposed to ‘events of a kind which is known, such as
eclipses, or thunderstorms’. Lakatos captures the spirit of boldness₂ when he
writes:⁴

A theory is the bolder the more it revolutionizes our previous picture of the 
world; for instance, the more surprisingly it unites fields of knowledge 
previously regarded as distant and unconnected; and even, possibly the 
“more inconsistent” it is with the “data” or with the “laws” it set out to 
explain ....

A theory, T₁, can be appreciably bolder₁ than a second theory, T₂, without
being significantly bolder₂ than that theory. For example, the theory which
claims that Newton’s law of gravitation applies to all bodies is bolder₁ than a
theory which makes this claim for macroscopic bodies only. Yet, in our present
state of knowledge, the more general assertion is not significantly bolder₂ than
the more restricted one. This situation would alter, of course, if some consideration
arose to cast doubt on the applicability of the law of gravitation to molecules
or parts of molecules. It is also possible for a theory to be bolder₂ but not bolder₁
than a second theory if boldness₁ is identified with degree of falsifiability and is
to be compared by methods outlined by Popper in Chapter 6 of his [1959]. This
can be illustrated by consideration of the following two laws.

(i) All gases (including newly discovered gas X) obey Boyle’s law.

(ii) All gases except X obey Boyle’s law, while gas X obeys the relation
P²V = a, a constant.

Although the degrees of falsifiability of these two laws are incomparable according
to the subclass relation, they are comparable according to Popper’s criterion
involving their ‘dimension’. On this criterion the two laws are equally falsifiable.
The simplest potential falsifiers in each case are comprised of two pairs of
corresponding pressure and volume readings. The two laws are thus equally
bold₁. But, relative to present knowledge, (ii) is certainly bolder₂ than (i). It is
not possible for similar situations to arise when two theories are comparable by
the subclass relation. If a theory A is bolder₁ than theory B because the class of
potential falsifiers of B is a proper subclass of the class of potential falsifiers of
A, then B cannot be bolder₂ than A. Any unlikely predictions made by B will
also be made by A.
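
The claim that two pairs of readings are the simplest potential falsifiers of either law can be illustrated as follows. This is a sketch under idealising assumptions: readings are taken to be exact (no experimental error), the relation for gas X is read as P²V = constant, and the function names `boyle`, `rival` and `falsified_by` are hypothetical.

```python
from fractions import Fraction


def boyle(p, v):
    """Quantity that law (i) asserts to be constant for gas X: P * V."""
    return p * v


def rival(p, v):
    """Quantity that law (ii) asserts to be constant for gas X: P^2 * V."""
    return p * p * v


def falsified_by(law, readings):
    """True if exact (pressure, volume) readings yield differing values of the
    quantity that `law` asserts to be constant."""
    values = {law(Fraction(p), Fraction(v)) for p, v in readings}
    return len(values) > 1


if __name__ == "__main__":
    readings = [(1, 4), (2, 2)]  # two hypothetical readings on gas X
    print(falsified_by(boyle, readings), falsified_by(rival, readings))
    print(falsified_by(boyle, [(3, 5)]), falsified_by(rival, [(3, 5)]))
```

A single reading can refute neither law, while a suitable pair of readings can refute either, which is the sense in which the two laws have the same dimension.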

1 Ibid., p. 377. He also turns ‘boldness’ into a relation such that ‘T, may be bolder relative 
to T; and Tj at the same time bolder relative to Tg (Lakatos [1968], p. 379). This is a 
trivial implication of his definition of boldness as excess content rather than Popperian 
absolute content.

2 See, for example, Popper [1969], pp. 238 and 390.

3 Popper [1969], p. 117. 4 Lakatos [1968], p. 376.

If a conjecture is bold₁ or bold₂ it can be said, prior to any empirical test, that
the likelihood of its being refuted is high. It must be counted as an advance, 
then, if, in spite of the unlikelihood of a boldly conjectured, highly falsifiable 
theory standing up to experimental tests, the predictions of the theory turn out 
to be confirmed. And this advance will be the greater the bolder the conjecture.
To quote from Popper’s essay ‘Truth, Rationality and the Growth of
Knowledge’:¹


A theory which is not in fact refuted by testing those new and bold and 
improbable predictions to which it gives rise can be said to be corroborated 
by these severe tests. I may remind you in this connection of Galle’s dis- 
covery of Neptune, of Hertz’s discovery of electromagnetic waves, of 
Eddington’s eclipse observations, of Elsasser’s interpretation of Davisson’s 
maxima as interference fringes of de Broglie waves, and of Powell’s 
observations of the first Yukawa mesons. All these discoveries represent 
corroborations by severe tests—by predictions which were highly im- 
probable in the light of previous knowledge (previous to the theory which was 
tested and corroborated). 


Popper gave further emphasis to the importance of experimental confirmations 
as well as refutations later in the same essay. He claimed that ‘if the progress of 
science is to continue, and its rationality not to decline, we need not only success- 
ful refutations, but also positive successes’ and he observed that ‘an unbroken 
sequence of refuted theories would soon leave us bewildered and helpless’.²
Popper argues that it is only by surviving an independent test³ that a theory can
be said to be an advance on its predecessor. He also insists that it is ‘only through
... the temporary successes of our theories that we can be reasonably successful
in attributing our refutations to definite portions of the theoretical maze’.⁴

Making bold conjectures which are subsequently confirmed by experiment 
does not constitute learning from one’s mistakes but rather learning from one’s
successes. Popper’s own methodology is thus inconsistent with the claim that we 
learn only from our mistakes. This is not a particularly important point, of course, 
since we can hardly expect the essence of a whole methodology of science to be 
captured in a single slogan. The main point of this paper is a criticism of the way 
in which Popper suggests we learn effectively from our mistakes, namely, by 
making bold conjectures and refuting them. It is to some remarks on the refuting 
instances of theories that we now turn. 

Tests of a bold, highly refutable theory that result in a confirmation of the 
theory are informative because of the unlikelihood of the result as assessed prior 
to the tests. In the case of a bold₁ conjecture, the unlikelihood is a consequence of
the magnitude of the class of potential falsifiers of the theory compared with the 
class of potential falsifiers of the theory it challenges. The greater the difference 
in the two degrees of falsifiability the more informative the confirmation. An 
analogous line of reasoning leads to the conclusion that, if a test of a theory 
results in its refutation, then such an occurrence will be the more informative 
the less falsifiable the theory is. We learn more from the experimental refutation 
of the law ‘all orbits of planets are ellipses’ than from a refutation of the bolder₁,

1 Popper [1963], p. 220. 2 Popper [1969], pp. 243-4.
3 For a discussion of the notion of independent test see Popper [1957].
4 Popper [1969], p. 243.
more falsifiable law ‘all orbits of planets are circles’, if only for the reason that
the latter refutation follows from the former one, a circle being a special case of 
an ellipse. The refutation of a conjecture that is bold₂ will also be seen as less
informative than the refutation of a conjecture that seemed highly plausible in the 
light of previous knowledge. The refutation of parity conservation is a good 
example of an informative refutation of a highly plausible conjecture. We learn 
effectively from our mistakes, then, by refuting cautious rather than bold con- 
jectures. 

An informative unfalsified theory is one which has a large content and is 
therefore falsifiable to a high degree, and which makes unlikely predictions, and 
which nevertheless has withstood all attempts to refute it. An informative 
falsified theory is one which has small content, and which is therefore falsifiable 
to a low degree, and which makes no unlikely predictions, and which is never- 
theless refuted by experiment. From this it would seem that both bold and 
cautious conjectures have a role to play in science. Further, since we do not know 
in advance whether the theories we propose will be confirmed or refuted by test 
experiments, it is not possible to give a very general criterion in favour of either 
the bold, speculative or the cautious, conservative approach. 

This point loses some of its force as a criticism of Popper when it is recognised 
that the refutation of a cautious conjecture and the confirmation of a bold one 
often come together, as a result of one experiment designed to test a bold
conjecture. For, certainly if the conjecture is bold₂, a positive result of the test
experiment will refute the cautious conjecture implied in the acceptance of the 
background knowledge with reference to which the prediction of the new theory 
was judged to be unlikely. However, the falsification of a cautious hypothesis is 
not necessarily accompanied by the confirmation of a rival hypothesis, as will be 
clear from two of the examples in section 2 of this paper. 

In spite of the reservation in the previous paragraph, the case argued so far 
constitutes a criticism of Popper’s often misplaced emphasis on falsification, 
exemplified in the quotation on the first page of the paper. On at least one occa- 
sion Popper himself was led astray by his mistaken emphasis, when he wrote of 
experiments designed to test a new theory that ‘even if these should at once lead 
to the refutation of the theory, our factual knowledge will have grown through 
the unexpected results of the new experiments’.1 For if the theory under test has
the ‘bold’ character advocated by Popper, then whether ‘bold’ is interpreted as 
bold₁ or bold₂, precisely the reverse is true. It is the confirmation of the theory that
is unexpected, not its refutation. 

According to Popper’s account of science, appropriately interpreted, and in- 
deed, according to most accounts, a theory is to be judged largely (though not 
exclusively) by the extent to which it is supported by experiment. Popper’s 
important insight was to point out the correlation between the degree of support 
and the boldness, the improbability, of the theory under test. Falsification of a 
theory can also be informative on occasions, provided the falsified theory is a 
cautious rather than a bold one. Only in the latter case can it be said that we learn 
from our mistakes. 


2 In this section the claims made in section 1 are supported by historical
examples, mainly centred on the work of Clerk Maxwell.


1 Popper [1969], p. 242, my italics. 




The kinetic theory which Maxwell helped to develop on a statistical basis was 
in some senses cautious, and because of this its subsequent refutation was highly 
informative. Maxwell’s theory of electric charge was also cautious, but was too 
vague to be refuted, and consequently it was not fruitful. It was the more specu- 
lative theory of charge involved in Lorentz’s electron theory (see his [1935]), the 
first version of which was published in 1892, that was to pay dividends, by 
predicting results that were confirmed by experiment. 

The foundations of the statistical kinetic theory were laid by Maxwell and 
Boltzmann in the seventh and eighth decades of the nineteenth century. The 
basic tenets of the theory as applied to gases can be written as follows: 


(a) Gases are made up of molecules in random motion. 


(b) The molecules form a dynamical system governed by the principles of
Newtonian mechanics. 


(c) The velocities associated with each degree of freedom of the molecules 
follow the Maxwell-Boltzmann distribution law. 


From these assumptions it is possible to derive an inequality involving the ratio 
of the principal specific heats of a gas, as is described below. This inequality was 
convincingly refuted by experiment. 

The Maxwell-Boltzmann theory was to some degree cautious as opposed to 
bold₁ and to bold₂.

The theory was cautious as opposed to bold₁ because it did not need to assume
any details of the interactions between molecules or parts of molecules. The 
inequality that it required specific heats to satisfy followed whatever these 
interactions might be. The theory characterised by (a), (b) and (c) only was more 
cautious than any theory that conjoined specific assumptions about interactions to 
those fundamentals (as in Maxwell’s own account of the viscosity of gases, for 
example).2 The essential Maxwell-Boltzmann theory was also more cautious
than a speculative theory proposed around the same time by William Thomson 
which aimed to explain the behaviour of molecules by assuming them to be 
vortices in an all-pervasive aether. 

The Maxwell-Boltzmann theory was cautious, as opposed to bold₂, in so far as
its basic assumptions were well confirmed and widely accepted at the time. 
Most physicists, certainly by the fourth quarter of the nineteenth century, 
accepted the general validity and applicability of Newton’s laws and that matter 


1 Maxwell himself did not present the situation quite like this. At least on occasions he 
regarded (a) and (c) to be derivable from the phenomena on the assumption that the 
physical world is a material system governed by Newton’s laws. For a detailed discussion 
of the relevant aspects of Maxwell’s methodology see Chalmers [1973a]. See also Dorling
[1970] for an analysis of Maxwell’s attempt to derive the foundations of his kinetic theory 
from the phenomena. 

2 Such a theory need not be ad hoc, in Lakatos’s sense (Lakatos [1970], p. 175) and may
well contain a ‘simple, new and powerful, unifying idea’, as demanded by Popper
(Popper [1969], p. 241). For instance, a simple law might be proposed specifying the
force between molecules as a function of their atomic weight or position in the periodic
table, or the force might be deduced from a general theory such as Thomson’s vortex
atom theory.
is made up of molecules, although of course there were notable exceptions such
as Mach and Duhem, for whom any assumption involving the existence of mole- 
cules was too bold. With regard to assumption (c), no alternative to classical 
(Boltzmann) statistics was envisaged, and the Maxwell-Boltzmann distribution 
follows from this fortified by some very plausible assumptions.2

Let us look at the account of specific heats offered by the theory in a little 
detail. 

It is a consequence of the Maxwell-Boltzmann distribution law that with each
degree of freedom possessed by a molecule there is associated a kinetic energy
½kT, where k is Boltzmann’s constant and T is the absolute temperature. Given this, it follows in a straightforward
way that the ratio γ of the principal specific heats of a gas is given by


$$\gamma = \frac{C_p}{C_v} = \frac{n + e + 2}{n + e}$$





where n is the number of degrees of freedom possessed by a molecule of the gas
and e is a measure of the potential energy acquired by a molecule of a gas per
unit rise in temperature.3

If the molecules of a gas are monatomic, and if it is assumed that the molecules 
are smooth spheres or points so that they cannot be set rotating and are incapable 
of distortion or internal vibration, then e = 0 and n = 3, the three translational
degrees of freedom of each molecule. Consequently we predict γ = 1.667 for
such a monatomic gas. The next simplest type of gas would seem to be one whose 
molecules are comprised of two atoms of the type already considered joined 
together. For a diatomic gas of this kind we have n = 6 and

$$\gamma = \frac{e + 8}{e + 6}$$

so that 1 < γ < 1.334. Any further increase in the complexity of the molecules of
the gas will decrease the values predicted for γ. The beauty of these predictions
concerning the outcome of specific heat measurements is that they follow from 
the very general assumptions (a), (b) and (c), together with some very general 
considerations concerning degrees of freedom. No experimental data need be 
fed in. 
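
As a rough numerical check on the predictions above, the following Python sketch evaluates the ratio γ = (n + e + 2)/(n + e) for the cases discussed. The function name `gamma_ratio` and the sample values chosen for e are illustrative assumptions and form no part of Maxwell's presentation; e is treated simply as a non-negative number.

```python
def gamma_ratio(n, e=0.0):
    """Predicted ratio of the principal specific heats, (n + e + 2) / (n + e).

    n: number of degrees of freedom per molecule (must be positive).
    e: the potential-energy term of the text (assumed non-negative here).
    """
    if n <= 0 or e < 0:
        raise ValueError("n must be positive and e non-negative")
    return (n + e + 2) / (n + e)


if __name__ == "__main__":
    # Monatomic gas of smooth spheres: n = 3, e = 0.
    print(f"monatomic, n=3, e=0: gamma = {gamma_ratio(3):.3f}")

    # Diatomic gas with n = 6: gamma falls from 4/3 towards 1 as e grows.
    for e in (0.0, 1.0, 10.0):  # illustrative values of e only
        print(f"diatomic, n=6, e={e}: gamma = {gamma_ratio(6, e):.3f}")

    # Further degrees of freedom lower the predicted value still more.
    for n in (3, 6, 9, 12):
        print(f"n={n}, e=0: gamma = {gamma_ratio(n):.3f}")
```

Nothing empirical enters the calculation: the predicted values depend only on the count of degrees of freedom and on e, which is why a measured γ near 1.4 bears so directly on the basic assumptions.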


1 In 1893, in the Preface to The Principles of Mechanics, Hertz (Hertz [1894]) confidently 
asserted that ‘the problem of physics consists in tracing the phenomena of nature back to 
the simple laws of mechanics’. Maxwell, for one, certainly agreed with him. (For details
see Chalmers [1973a].) In support of assumption (a) Maxwell, in 1870, claimed that ‘the
evidence from different and independent sources is now crowding in upon us which
compels us to admit that’ if we were to continually subdivide matter ‘we should come to 
a limit because each portion would then contain only one molecule’ (Maxwell [1870], 
pp. 221-2). He proceeded to invoke considerable support for his claim. 

2 Maxwell first derived the distribution law from the laws of probability in 1859 (Maxwell
[1860], pp. 380-1). The assumptions implicit in his derivation are (i) The number of
molecules per unit volume is the same in any region of a volume of gas, (ii) There is no
preferred direction in a gas, (iii) The velocities along any three coordinate axes are
independent of each other, and (iv) The number of molecules possessing a velocity in a
given interval is a function of the velocity and the size of the interval only. Boltzmann was
to improve on Maxwell’s treatment in so far as he did not assume (iii). Rather, it followed
from his derivation that (iii) is satisfied for elastic collisions. Maxwell accepted
Boltzmann’s improvement (Maxwell [1866], p. 42).

3 For Maxwell’s derivation see Maxwell [1875], pp. 431-3.
How did the data available in 1875 fit these predictions? Maxwell was aware
that the experimentally determined values of γ for several gases such as Hydrogen,
Oxygen and Nitrogen were in the region of 1.4. This is inconsistent with the
predictions of the theory as stated above, and Maxwell himself described the
state of affairs as ‘the greatest difficulty yet encountered by the molecular theory’.1
It is clear that if the theory is to be brought into line with experiment then the 
number of degrees of freedom of supposedly diatomic gases must be less than 
6 for some reason. One possibility of reducing the number was adopted by 
Boltzmann. In addition to assuming that the atoms comprising the diatomic 
molecules were smooth spheres so that rotations about the axis joining the two 
atoms could be discounted, Boltzmann assumed that the connecting links 
between the pair of atoms be perfectly rigid. In this case we have n = 5 and 
γ ≤ 1.400, which is compatible with the experimental data. Maxwell was not
happy with this assumption. He pointed out2 that the theory demanded that the
molecules must be infinitely rigid and not just very highly rigid if the vibrational 
degrees of freedom are to be ignored, and that this leads to difficulties when 
molecular collisions are considered. 

Even if Boltzmann’s amendment is accepted, the theory is not saved from 
refutation as soon as the spectra of gases are taken into account. The situation 
was brought out very clearly as a result of the measurement of γ for mercury
vapour by Kundt and Warburg in their [1876]. Chemical evidence suggested
that mercury vapour is monatomic, and consequently, when the measurement of
γ for that gas yielded a value of 1.666, this seemed to be an excellent confirmation
of the Maxwell-Boltzmann theory. But, given that mercury vapour has a spec- 
trum, then anyone, like Maxwell, committed to a mechanical explanation of 
spectra is forced to conclude that molecules of mercury vapour are able to vibrate. 
But once this is admitted the number of degrees of freedom of each molecule is 
greater than 3, and hence the theory predicts that the value of γ should be
significantly less than 1.667. Not only do the experimental facts refute the theory, but
it is also possible to see that no modification or additional assumption can save 
the conjunction of assumptions (a), (b) and (c). Attempts to save the basis of the 
theory by introducing the interaction of molecules and the aether, as suggested 
by Boltzmann, only make matters worse as Maxwell in his [1877] pointed out, 
since such measures introduce an infinite number of degrees of freedom, leading 
to a value of unity for the predicted value of γ. The same can be said of any
continuum theory such as the vortex atom theory. Maxwell himself made it quite 
clear that he regarded the specific heats problem as a refutation of the
foundations of his theory. In a review of H. W. Watson’s book on the kinetic theory of
gases, in which the consequences of the Maxwell-Boltzmann theory were clearly 
demonstrated, Maxwell wrote3:

The clear way in which Mr. Watson has demonstrated these propositions 
leaves us no escape from the terrible generality of his results. Some of these, 
no doubt, are very satisfactory to us in our present state of opinion about the 
constitution of bodies, but there are others which are likely to startle us out 
of our complacency and perhaps ultimately to drive us out of all the hypo- 
theses in which we have hitherto found refuge into that state of thoroughly 
conscious ignorance which is a prelude to every real advance in knowledge.


1 Maxwell [1875], p. 433. 2 Maxwell [1877], p. 245. 3 Ibid., pp. 245-6.
By proceeding cautiously, then, and by attempting to develop his kinetic
theory by appealing to the general assumptions (a), (b) and (c) only, Maxwell was
led to a very informative refutation. Dorling in his [1970]1 has described
Maxwell’s method in his work on kinetic theory as a ‘way of devising as severe as
possible a test of classical physics’. The results of the test showed Maxwell that
there was something wrong with his basic assumptions. Popper in his [1969]2
advises that ‘in searching for the truth, it may be our best plan to start by 
criticizing our most cherished beliefs’. Maxwell’s experiences with his kinetic 
theory demonstrate a sense in which this can be done effectively by proceeding
cautiously rather than boldly, by investigating the consequences of our most 
cherished beliefs only, and not by conjoining them with additional hypotheses 
which will increase the content and falsifiability of our theory and which will 
consequently devalue the information content of a subsequent refutation.3

According to my presentation of it, Maxwell’s cautious approach to his 
kinetic theory paid off because, in spite of his caution, the theory was refuted. In 
contrast to this, Maxwell’s cautious approach did not pay off in his electro- 
magnetic researches. The foundations of his theory were not refuted by experi- 
ment. The spectacular successes of his theory were brought about because he 
somewhat inadvertently diverged from his cautious methodology. His theory did 
suffer from some serious limitations, however, and these can be attributed to his 
caution. 

The major predictions of Maxwell’s electromagnetic theory, namely, the
propagation of electromagnetic effects in time and an electromagnetic theory of
light, were made possible by Maxwell’s introduction of a displacement current. 
In view of the logic of the situation this move by Maxwell is a classic example of a 
bold conjecture in Popper’s sense. However, Maxwell himself never fully 
acknowledged the hypothetical status of his displacement current and at times 
tried to pass it off as ‘deduced from experimental facts’ or deduced from
‘admitted facts’.4 Ignoring Maxwell’s false claim, Hertz’s production of
electromagnetic waves, a spectacular confirmation of Maxwell’s theory, is an excellent
example of a bold conjecture bearing fruit. It is also an instance of learning from 
our successes. 


1 Dorling [1970], p. 236. 2 Popper [1969], p. 6.

3 It is interesting to note that work on the kinetic theory continued in spite of its refuta- 
tion by specific heat measurements. One argument in support of such a course of action 
has been stressed in Lakatos [1970], who notes that apparent refutations are often viewed 
as anomalies rather than refuting instances and are tolerated in the hope that future 
developments in the theory (or research programme) will account for them, turning 
them into confirming instances. But the refutation of the kinetic theory was not viewed in 
this way, at least by Maxwell. Not only did he regard the theory as refuted, but he re- 
cognised that no change, short of rejecting some of the basic foundations of the theory, 
could enable the specific heat measurements to be accounted for. And yet Maxwell and 
his successors continued to work on the theory. Work on the development of a theory 
known to be false can be productive in two ways. Such work may lead to additional 
refutations of the theory which may give useful pointers towards the character of the new 
theory, and it may yield significant novel results that are confirmed by experiment. As far 
as classical-statistical mechanics is concerned, an example of the first kind is the
prediction of the ultra-violet catastrophe and an example of the second kind is Einstein’s
analysis of Brownian Motion.

4 Maxwell [1864], p. 564 and Maxwell [1868], p. 138. For a detailed account of the origin
and status of Maxwell’s displacement current see Chalmers [1973a].
Although, when he introduced his displacement current, Maxwell fortunately
departed from the cautious methodology he explicitly advocated, the rest of his 
theory still bore the marks of his anti-hypothetical inclinations. This is most 
apparent in his account of electric charge. Among the general equations of 
Maxwell’s theory was the relation 


$$\operatorname{div} \mathbf{D} = \rho$$


between the charge density, ρ, and D, the electric displacement. Maxwell
regarded D as a vector representing some state of the medium surrounding 
electrified bodies and he identified charge with the divergence of that vector. 
He did not venture to suggest the detailed description of the state, however, and 
he certainly declined to assume that electricity took the form of one or two local- 
ised fluids which was a common assumption among the Continental, action at a 
distance, theorists. This vagueness over the nature of charge, which was to some 
extent admitted and defended by Maxwell, was inherited by many of Maxwell’s 
successors. The theory that they developed was unable to provide an adequate 
account of electromagnetism for bodies in motion or of the optical and electrical
properties of bodies in terms of their microstructure, and this was a direct 
consequence of the cautious nature of the theory. The bolder theory of Lorentz, 
on the other hand, which owed much to the action at a distance tradition and 
which postulated the existence of electrons, was progressive and informative on 
account of its successes. It offered an explanation of the success of Fresnel’s 
optical theory for moving bodies and also yielded accounts of the electrical and 
optical properties of materials. Lorentz was able to predict in some detail the 
effect of a magnetic field on spectral lines. Zeeman in his [1897] described these
predictions as ‘extremely remarkable’,2 indicating their bold₂ character, but
nevertheless was able to confirm them by experiment.3

We have now considered examples of a cautious theory, Maxwell’s kinetic theory,
being highly informative on account of its refutation; a cautious theory, Maxwell’s 
account of the nature of electricity, which was not informative because it was not 
refuted; and bold conjectures, Maxwell’s displacement current and Lorentz’s 
electron theory, which were highly informative because they were confirmed by 
experiment. Another example of the first kind is the informative refutation of 
naive set theory, which was surely cautious as opposed to bold₂, by Russell.
Support for the case argued in this paper is completed by instances of the fourth 
possibility, bold conjectures that were not confirmed by experiment. One such 
example is the failure of the attempt by Tisserand4 to account for the motion of
Mercury’s perihelion by introducing velocity dependent forces. The refutations 


1 Maxwell [1892], Vol. 1, p. x. For a discussion of this and other inadequacies of Maxwell’s 
theory see Chalmers [1973b].

2 Zeeman [1897], p. 232.

3 Zeeman’s initial experimental investigation of the effect of magnetism on spectral lines
was not suggested to him by Lorentz’s theory. However, having found that spectral
lines become broadened in a magnetic field, Zeeman wrote to Lorentz asking how the
effect might be explained by the latter’s electron theory. Lorentz responded with a
detailed account of the effect and predicted, among other things, that the light at the
edges of the broadened lines should be circularly polarised. Zeeman was able to confirm
this prediction. The historical details are reported in Zeeman [1897].

4 Tisserand [1872]. 
of such bold conjectures were not particularly informative and had negligible
influence on the development of physics.


A. F. CHALMERS 


University of Sydney 
VERISIMILITUDE VERSUS PROBABLE VERISIMILITUDE *


The concept of verisimilitude was introduced by Popper ([1963], Chapter 10, 
Sections IX-XIII, and Addendum 3) as part of a development of his earlier 
views. The notion is explained in terms of the ‘content’ of a statement or theory, 
that is, the class of its logical consequences. This class is then divided into the 
sub-class of the true logical consequences, the ‘truth-content’, and the sub- 
class of the false logical consequences, the ‘falsity-content’. (The falsity-content 
does not contain any of the true consequences derivable from the false state- 
ments which form its elements.) Verisimilitude is defined as ‘something like’ 
the difference between the truth-content and the falsity-content of a given 
theory.1 Successive theories can then be compared according to their degree of
verisimilitude. Such comparison can only be tentative; a point to which I shall
return later. For the moment, however, I wish to discuss a criticism of the
Popperian notion as it has been put forward by Swinburne.2 I shall argue that
this criticism is not justified. 

Popper’s main concern, when he introduced verisimilitude, was to save 
realism: to bolster up the doctrine that science aims at a true description of the 
world by making respectable such ideas as approximation to the truth and 
getting nearer to the truth. The relationship between verisimilitude and other 
Popperian notions such as corroboration—the extent to which a theory has in 
the past stood up to severe tests—is by no means unproblematic. Thus, Lakatos 
says that we are not justified in claiming that even the greatest past successes 
give ‘evidential support’ to assumptions about the future success of a given 
theory unless we make the ‘tentative metaphysical assumption that increasing 
corroboration is a sign of increasing verisimilitude’.3

In his discussion of verisimilitude, Swinburne seems to argue as follows. 
Since a scientific theory consists of universal or statistical hypotheses, the 
consequences will be infinite in number.4 Hence, the notions of unit of content,
truth-content and falsity-content must be modified in some way to enable us 
to handle them. Swinburne’s modification takes the form of introducing two 
new notions, namely, a unit of content such that each theory has only a finite 
number of such units and a concept labelled ‘probable verisimilitude’. This 
latter concept is defined in terms of the probably true consequences and the 
probably false consequences of a given theory. Given these restrictions, 
Swinburne’s conclusion is that ‘accepting the better corroborated and more 
testable theory is not necessarily or in general likely to lead to acceptance of 
theories of increasing verisimilitude.’ 5 

Now this conclusion highlights the difficulties mentioned earlier and may be 
unexceptionable but the means by which it is reached are highly suspect. What 
is the status of ‘probable verisimilitude’? Popperian verisimilitude is ‘timeless’ 


* Editor’s Note: This discussion note was received and accepted before the publication of 
Popper [1972], in Chapter 2 of which Popper further develops his views on verisimilitude. 

1 Popper [1963], p. 393. 2 Swinburne [1971], pp. 173-6.

3 Lakatos [1968], p. 405. Lakatos has recently developed this theme further (see his liors
section 2). 

4 In fact, of course, any statement, whether universal or singular, has denumerably many
consequences.

5 Swinburne [1971], p. 175.
since it depends upon the comparison of true and false consequences, whereas
Swinburne’s consideration of consequences which are only probably true or 
probably false introduces a strong temporal element. A statement can be im- 
probable today and probable tomorrow. Swinburne has thus distorted the 
whole notion and, whatever he is doing, he is not talking about the idea which 
Popper put forward.2

Of course this is not to deny that the comparison of the classes of true and 
false consequences is a risky business. Popper admits that we cannot know that
the theory t₂ has a higher degree of verisimilitude than the theory t₁ and adds
that we can only guess.3 Looked at in one way, this is not surprising: for Popper, 
fallibility is ubiquitous. In fact, estimates of verisimilitude are at least doubly 
fallible. This becomes clear if we consider initially the assignment of a degree 
of corroboration to a given theory. Such an assignment is bound to be fallible 
since it is dependent upon the confrontation of the theory with certain basic 
statements which act as potential falsifiers and the assignment of truth-values to 
these basic statements is fallible.4 In addition, it has already been pointed out
that the relationship between corroboration and verisimilitude is such that it 
rests upon a ‘tentative metaphysical assumption’, that is, upon a principle which 
is itself fallible. It follows that our estimates of degrees of verisimilitude are 
tentative, that guessing is the best that we can do. 

Swinburne, however, takes another line. His point is that we can know the 
truth or falsity of only a finite number of ‘particular consequences of a theory’.5
This leads to ‘probable verisimilitude’ and the conclusions based upon it. It 
leads also to a number of errors. Thus, Swinburne contends that the ‘known 
past truth-content minus known past falsity-content’ of a theory is zero, because 
‘if an infinite number of predictions . . . means a finite amount of content, then 
surely a finite number of... predictions means zero content’, and it is the 
case that ‘the truth-content of any proposition, as also its falsity-content, is 
less than or equal to its content’. Thus ‘probable verisimilitude’, which Swin- 
burne for no logically sufficient reason that I can see, equates with ‘(known past 
truth-content minus known past falsity-content) plus (probable future truth- 
content minus probable future falsity-content)’, reduces to the second term 
of the sum. But the first part of this argument is technically erroneous (leaving 
aside the question of the utility of these notions): Swinburne seems to overlook 
the standard notion of a measure,7 and Popper quite emphatically discusses his


1 In the sense that p(h, e) alters as e accumulates in time. (There is nothing to suggest that 
Swinburne is using the concept of probability in any other sense.) 

2 [Added in the galleys]: A similar point has been made by Miller in his reply to Robinson’s 
criticisms of Popper’s notion of verisimilitude. (Miller [1972], p. 52.) 

3 Popper [1963], p. 234.

4 Cf. Lakatos’s emphasis on the conventionalist element in Popper’s philosophy. Among 
the implications of this is the point that the truth-values of singular statements are (or, 
strictly, may be) assigned by decision and are open to challenge by testing. (Lakatos 
[1970], pp. 103 ff.) 

5 Swinburne [1971], p. 174. 6 Ibid., pp. 174-5.

7 We can assume a normalised measure μ defined on the field of classes of models of
sentences of some language. Then we can define the measure of content of a sentence a
to be the measure μ of the class of models of ~a, that is, content(a) = 1 − μ(M(a)),
where M(a) is the class of models of a. Hence, we have 0 ≤ content(a) ≤ 1 and, if a
and b have no content in common, content(a & b) = content(a) + content(b).


idea of verisimilitude with respect to measures of content-classes. Now it is
quite possible to define a measure on a field of subsets of an infinite set which 
assigns a finite value to a finite subset of the space in question (though—for 
finite measures—the values assigned to a denumerably infinite sequence of 
pairwise disjoint sets must tend to zero). 
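
The parenthetical point can be illustrated with a toy measure. The Python sketch below is an illustration under stated assumptions, not Popper's or Swinburne's construction: it takes the natural numbers as the space, gives the point k the weight 2^-(k+1) so that the whole space has measure 1, and shows that a finite subset receives a positive value while the values of the singletons tend to zero. The names `point_weight` and `measure` are hypothetical.

```python
from fractions import Fraction


def point_weight(k):
    """Weight of the single point k (k = 0, 1, 2, ...): 2 ** -(k + 1)."""
    if k < 0:
        raise ValueError("k must be a non-negative integer")
    return Fraction(1, 2 ** (k + 1))


def measure(points):
    """Measure of a finite set of points: the sum of their weights."""
    return sum((point_weight(k) for k in set(points)), Fraction(0))


if __name__ == "__main__":
    # A finite subset of an infinite space can carry positive measure.
    print(measure({0, 1, 2}))

    # Additivity on disjoint sets.
    print(measure({0}) + measure({3}) == measure({0, 3}))

    # The weights of the pairwise disjoint singletons tend to zero.
    print([float(point_weight(k)) for k in (0, 10, 20, 30)])

    # Every finite subset has measure strictly below that of the whole space.
    print(measure(range(60)) < 1)
```

On such a measure the inference from 'an infinite number of predictions has finite content' to 'a finite number of predictions has zero content' does not go through.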

In any case, Popper makes it clear that verisimilitude ‘is not an epistemological 
or an epistemic idea’ 1 (italics in the original). Rather, it is an ontological idea 
and so Swinburne’s apparent attribution to Popper of the view that our lack
of knowledge arises because we can know the truth or falsity of only a finite 
number of consequences of a given theory is entirely mistaken. 


KEITH E. JONES 
University of Kent at Canterbury 
HARRE AND NONLOGICAL NECESSITY


1 In his review of Harré’s Principles of Scientific Thinking,2 Mr David Miller
announces that the book is opaque and proceeds to reconstruct its themes. ‘We
will never illuminate if we don’t speculate’ (p. 70) and kindred expressions occur
throughout the review. The result is that Miller’s review is mainly a critique of 
his own speculations, a procedure which makes reviewing a book less arduous 
than usual. Much of what he ascribes to Harré is far off the mark. We aim to give 
a more accurate account of Harré’s views and hopefully to advance the discussion 
of what, after all, are crucial issues. 


2 Miller writes that the problem of induction is the central theme of Harré’s 
book and ascribes to him the view that statements about natural necessities and 
natural kinds can be known with certainty, are indubitable, and are established 
by non-inductive means. He quotes numerous passages from Harré and then 
concludes that ‘the central contention of the book’ is that ‘the problem of in- 
duction can be evaded, and scientific explanations established for certain’ (p. 70). 
‘Briefly, Harré’s argument is this (I think). Science is a rational activity; so if we 


1 Op. cit., p. 234. 2 Miller [1972].
look at what scientists do we should be able to discern rational principles at work,
some of them, since scientists certainly achieve certainty, constructive ones’ 
(p. 70). For Harré, ‘the verification of existential statements and the divining of 
essences are both non-inductive tasks’ (p. 72). However, ‘Harré’s whole position 
is simply swept away when we realise that even the best scientists, or rather, 
especially the best scientists, are always making mistakes; and that deductive 
logic is one way of detecting these mistakes’ (p. 72). The obvious objection is also 
that Hume shows ‘no factual statement whatever can be established beyond all 
doubt’ (p. 72). Harré’s statements about essences, ‘since they are obtained by 
“analogical inference”, cannot be established as indubitable, except as a matter of 
convention’ (p. 72). 

William James once warned a student that in analysing another person’s 
thought the stringing out of texts leads nowhere unless you have first grasped his 
centre of vision. Miller misses this centre completely. It is simply false that the 
central thrust of the book is the evasion of the problem of induction. Harré’s 
consideration of this specific problem is quite peripheral. If we see clearly 
Harré’s central vision then we will understand not only where the problem of 
induction fits into the overall scheme but will also appreciate why the traditional 
Humean criticisms offered by Miller are beside the point. 

Harré’s claim is that Goodman’s paradoxes, the paradoxes of confirmation, the 
problem of distinguishing nomic and accidental universals, the problem of 
induction, et al., instead of problems to be solved, constitute the reductiones ad 
absurdum of a system which begins with the view that events are invariably and 
irrevocably independent. This independence follows from the Humean concept 
of an event. Events are not conceived in the ordinary sense as changes in a state 
of affairs but as time-slice instants of such states of affairs. That there are no 
connections between events so conceived is a corollary of this ontology. Hume’s 
‘C · ~E is never self-contradictory’ argument is really redundant in the context
of this concept of an event. But why accept such an ontology? Such ontological 
atomism was taken to be a consequence of epistemic atomism, but we have 
learned from Dretske and Joske that the latter does not entail the former at all. 
So even if we were to accept Hume’s point of departure that the only things we 
are directly aware of are our own (punctiform) impressions (which scarcely any 
epistemologist would be willing to do nowadays), it would not follow that we 
cannot talk about events in the ordinary sense and about the particulars of science 
and ordinary life that have explanatory dimensions as far as these events are 
concerned. It is generally supposed that the Humean in-principle argument and 
others are relevant to all ontologies. What Harré contends is that they are simply 
consequences of a particular ontology which there seems little reason to hold. 
Hence, Miller’s use of traditional Humean arguments against Harré is so much 
chaff in the wind. What Miller fails to do is attack Harré’s basic contention that 
the Humean critique is itself ontologically bound in a damaging way. Harré 
constructs an elaborate alternative neo-realistic ontology in which the problem of
induction, like all the other puzzles that philosophers of science have spent years
fruitlessly trying to solve, simply does not arise. Miller fails to show that, contrary to
appearances, they do arise—which, of course, is identical with the problem of
showing that Humean arguments are not ontology bound in a damaging sense.

Miller is again simply wrong in thinking that Harré ascribes certainty and in- 
dubitability to statements about natural necessity. First let us be clear about 


178 B. Cohen and E. H. Madden 


terminology. The concept of certainty is generally eschewed by contemporary 
epistemologists (as ambiguous and psychologistic) in favour of infallibility. To 
say, then, that a person infallibly believes a statement p to be true is equivalent to 
saying that he is exempt from the possibility of error in his belief that p. In- 
dubitability and incorrigibility are derivative concepts from that of infallibility. 
Any statement which is infallibly believed to be true is then indubitable or in- 
corrigible. Nowhere does Harré even hint at the notion that any scientist is ever 
exempt from the possibility of error in his belief that p. Not only does Harré not 
make such a claim but would consider it a severe blow against his view if it could 
be shown that he was committed to this claim. According to Miller, Harré not 
only holds that explanatory statements in science are indubitable but also that 
the scientist does not use inductive procedures in establishing them—that is, 
that the establishment of such statements is not by a posteriori procedures.
Again, not only does Harré not hold this view but would consider it a grave 
difficulty with his system if it could be shown that he is in fact committed to it. 
If Miller had advanced an argument to show that Harré was so committed in- 
stead of ascribing to him such untenable views, he would have contributed 
something significant to the issues involved. Though he does not, Miller could
have claimed (and perhaps was assuming) that there is a material equivalence
between ‘x is necessary’ and ‘x is a priori’, on the one hand, and ‘x is contingent’
and ‘x is a posteriori’, on the other; and hence that if one claims a scientific 
explanation is necessary he is committed to the view that it is known in an a 
priori fashion. This claim is involved in the traditional Humean argument against 
natural necessities. Given these equivalences and the fact that scientific explana- 
tions are never achieved by a priori means, it follows, according to Hume, that 
any view committed to necessary connections between matters of fact must be 
false.1 However, the crucial point is that the equivalences are Humean assump- 
tions and rarely, if ever, defended. We have shown in detail in several other con- 
texts that there are good reasons for rejecting these equivalences and will limit 
ourselves here to a brief statement of one reason for rejecting the equivalence 
between ‘x is contingent’ and ‘x is a posteriori’ and one for rejecting the equivalence 
between ‘x is necessary’ and ‘x is a priori’. 


3 (i) The break-up between ‘x is contingent’ and ‘x is a posteriori’ can be accom- 
plished by showing that the equivalence claim rests upon a confusion basic to the 
Humean way of thinking. This confusion is a variation on the one everywhere 
present in Humean thought, namely, the confusion between the meaning of a 
proposition and the grounds we have for holding it. This confusion has been 
pointed out in so many contexts that we will not repeat the argument but assume 
it to be known. The variation of this confusion, which is crucial in the present 
context, is a second-order one. When the Humean claims that all propositions 
known a posteriori must be contingent, he is assuming that a proposition itself 
must be characterised by, or have its nature determined by, the way we come to 
know, or believe, it to be true. However, it is clear that the contingency or neces- 
sity of a proposition, like its meaning, is independent of the way the proposition 
comes to be accepted as true, or known to be true. There may be good grounds 
for my believing that a certain truck is gasoline powered. It has only a faint
exhaust, the exhaust pipe is in the usual position, the driver stopped at a station
which sells only gasoline, and so on. It is surely correct to say that my belief is 
based on a posteriori grounds and it is always possible that I have been mistaken. 
The driver perhaps was using an especially high grade of diesel fuel, had changed 
the position of the pipe, had stopped at the station only to ask directions, and so 
on. But the dubitability, or corrigibility, of propositions about this truck being 
gasoline powered, or diesel powered, which reflects the relationship between the 
acceptability of a proposition and the evidence for it says nothing about the 
intrinsic nature, characterisation of, or meaning of the proposition itself.
Assuredly the proposition that the truck is gasoline powered is known to be true 
a posteriori and it is always possible that a person errs in believing it to be true; 
but these facts leave untouched the natural necessity which the proposition itself 
can be said to presuppose if it is true at all, if it, in fact, constitutes a genuine 
explanation. Again, the characterisation of an explanation as involving the 
concept of natural necessity, or natural kinds, exists quite independently of how 
a proposition concerning it comes to be held as true in a particular case. The 
nature of gasoline still explains why it explodes when detonated even though it 
was a mistake to claim that this nature provided an explanation of how the truck 
was powered in the present case. Whether or not the view that explanation 
requires an uneliminable concept of natural kind is correct, the claim that there 
are natural kinds is perfectly compatible with the notion that statements about 
them are acceptable only as a result of inductive procedures. 

(ii) The break-up between ‘x is necessary’ and ‘x is a priori’ can be established
in the following way. If there are no necessary connections between matters of 
fact, as the contemporary Humean positivist claims, then no confirmation in- 
stance adds any probability whatever to any inductive inference. One has to 
assume, then, enough necessary connections so that the possibility exists that any 
two properties are necessarily connected. Then evidence of a regularity is the 
first evidence which increases the chance the properties of a certain pair are 
necessarily connected. And 


From this it is inferred deductively that the same data makes more probable 
the claim that the same properties are always connected. There is then 
empirical support for claims for which there is also empirical support that 
they are necessary. And there seems to be no non-empirical support for 
such claims. So if there are any physical necessities at all, they are a
posteriori.

This is the familiar point that not all ensembles of predicates are equally likely to 
be manifested in nature. To assume that they are is to base one’s epistemology 
upon a metaphysical theory that the having of a property by an individual is to be 
construed as an event, and that this event is atomistic, that is, unconnected with 
any other event, including the having of any other property by the individual. 
But, of course, events are atomistic only if they are instantaneous time-slices, and 
the having of a property by an individual is certainly not such an event. 
We have been claiming that for Harré statements about explanations are not 
arrived at by a priori means. While this claim is perfectly true there is one way in 
which it is misleading. It is certainly true, as Hume claims, that from the heat and 
light of fire a person would be unable to infer that it would consume him. But for
Harré this result hardly comes as a surprise; it is simply a truistic consequence of 
the notion of the independence of all predicates. Of course, given only some of the 
sensible qualities of a material or thing one could not infer all the ways it would 
behave. But were we to know the nature of the material or thing (for instance the 
structure of its component atoms), he would add, we should certainly be able to 
make these inferences. Bacon had grasped this point, and it lay behind his 
insistence that scientific knowledge must be found in the forms or natures of 
things, not in mere concomitances of their sensible qualities. As he points out in 
the example of fire, which produces induration in clay but colliquation in wax, 
the concomitances of sensible qualities may often be equivocal. The job of the 
scientist, he contended, is to find what it is in the nature of fire and the things and 
materials it acts upon that explains these very different effects. If we know the 
nature of a particular that explains its properties, powers, and capacities and 
relates them into intelligible clusters then we can indeed infer from some of the 
powers and properties of that particular to others via its essential nature. Indeed, 
if we are successful in establishing the nature of the particular that relates its 
separate powers and capacities, we are in a good position to infer a priori the 
existence of still other powers and dispositional properties. A priori elucidations 
of the natures of particulars appear plentifully in science. 

The important fact still remains, however, that in establishing the nature of 
things and materials initially we go inferentially from properties, powers, and 
capacities to a hypothesis of a specific nature that explains them and that this 
backward procedure is of an a posteriori character utilising imaginative model 
building and analogy. But this backward inferential procedure, Harré would 
emphasise, enables us to choose a model for the nature that will be successful but 
does not affect the conceptual relationship between whatever nature is established 
and the powers and capacities it explains. There are usually competitive, N, N’, 
N” models of the nature of things and materials and the one is chosen which is 
successful in explaining the most powers and capacities of particulars and in 
leading to the discovery of heretofore unknown powers and capacities whose 
plausibility is guaranteed by the usual rational constraints on the imagination in 
model-building. This is the a posteriori aspect of the scientific enterprise, and it is, 
of course, essential. But once we see that its role is to decide which conceptual 
framework in fact is explanatory in a given area we also see that it in no way 
interferes with the necessary relation between whatever nature is decided upon 
and the powers and capacities of particulars that that nature explains. 


4 In addition to his general misconceptions, Miller is often mistaken on specific 
points. He seems unable to distinguish action at a distance between atoms from 
fields of potentiality with singularities. Boscovich’s theory, which Harré uses as 
an illustration of the move from atomism to fields, is clearly of the latter sort 
despite the fact that Boscovich refers to his singularities as ‘atoms’. This point 
has been made very clearly by Supek in his commentary upon the Theoria.
In his remarks about the desiderata offered as the minimal conditions for a 
relation to be spatial, Miller confuses necessary with sufficient conditions. Harré 
offers the three desiderata only as some amongst the necessary conditions for a 
relation to be spatial. Of course, there will be interpretations of the relation
defined in the three desiderata other than ‘between’ in the spatial sense. We guess
that it would be impossible to state formally the sufficient condition for a relation
to be the ‘between’ of space. Miller’s superficial reading of the argument has led 
him to overlook the general condition laid down at the very beginning of the 
discussion for spatial and temporal relations, namely, that they be acausal. This 
condition eliminates all those interpretations of ‘between’ which involve or con- 
tain a causal element. 

The superficiality of Miller’s reading is further evident in his critical but 
unsubstantiated remark about Harré’s discussion of alternative interpretations of
the desiderata. That discussion is intended to show how difficult it would be to 
interpret the terms of the relation of primitive ‘betweenness’ as qualities rather 
than things. The three desiderata are shown not to hold for colours. Nothing 
whatever is said or implied about alternative interpretations of the relation of 
primitive ‘betweenness’, of which there must be an enormous variety. What is 
maintained with the aid of the colours example is that whatever specific acausal 
interpretations of that relation are possible the relation will hold between thing- 
like entities and will not hold between quality-like entities. 

It must not be supposed that we think Miller’s review is worthless. In spite of 
his unsympathetic reading of the book, Miller, in a few places, comes close to 
an important issue and occasionally even hits the target. Miller writes that Harré 
goes ‘right off the rails’ by suggesting that the deductivist believes all terms in a 
scientific system must be ostensively defined. It is not clear that Harré holds this 
view, since he seriously considers reduction sentences in several places, but in 
any case it is clear that he does not discuss the nomological network view of 
Carnap and others. However, we have discussed this view on several occasions 
and believe it poses no problem for Harré. It is quite easy to construct phony 
systems that have theoretical components and meet the requirements of deduc- 
tive inter-relationship and indirect confirmation and yet have no explanatory 
value whatever. 

Having come close to an issue, Miller is off again on a tangent. Miller writes: 
‘The suggestion on page 18 that explanans of the deductivist kind cannot be
established for certain is of course true, but hardly relevant, as the same is true of 
all others’ (p. 71). Anyone who reads page 18 of Harré’s book will see that no 
such suggestion occurs. Miller seems obsessed with the mistaken idea that 
Harré seeks indubitable knowledge of which one can be certain. 

Miller observes that ‘the whole point of positing permanent entities was to 
explain how change is possible; such entities cannot therefore on their own explain 
why change does not occur’ (p. 73). What does it mean to explain why something 
does not change? On Harré’s view Aristotelian particulars can and do change, but 
Parmenidean particulars cannot and do not change. On the level of Parmenidean 
particulars it makes no sense whatever to ask why change does not occur—that 
they cannot follows from what we mean by such a particular. Such particulars 
can and do explain why change occurs in Aristotelian particulars. But what 
sense does it make to ask why Aristotelian particulars do not change even though 
they could? On Harré’s view, whatever changes requires explanation (and cannot 
itself be part of an ultimate explanation), while whatever remains constant or 
unchanged requires no explanation. If the stick of dynamite explodes we have to 
explain why, and this we do by saying it was detonated and then explaining the 
volatile nature of dynamite. However, the continued existence of the stick of 
dynamite requires no explanation except in the negative sense of saying nothing
happened to it—it was not badly shaken, lighted, and so on. It might be objected
that, if a stick of dynamite were ignited and yet did not explode, we would 
indeed have the problem of explaining why the stick continued to exist in its 
quiescent and latent state. However, this counter-example fails, since change can 
be seen to be involved if one delves below the surface features. Clearly there has 
been a change in powers and hence there must be a change in the nature of the 
particular involved to explain that change. The dynamite, we discover, was wet, 
and its structure was changed momentarily so that it was for the moment 
non-volatile. 

One final point. Examples of an empirical real essence from Harré’s point of 
view would be the use of chemical formulae to represent the constitution of a 
substance, or of the molecular hypothesis to represent the nature of gases. 
Instead of arguing against this application of the notion of a nature or real
essence, Miller offers the following curious bit of reasoning. He attacks the use of 
the concept of real essence in the analysis of the concept of field of potential, 
admits the role of hypotheses about fundamental fields in the explanation of the 
behaviour of things constituted by such fields, and then claims that this shows 
that the explanation of chemical behaviour in terms of the chemical constitution 
and structure of the molecules of elements and compounds will not ‘work’ 
either. 

If Miller means that the reference to the chemical constitution of a substance 
in explanation of its chemical behaviour is not an ultimate or final explanation of 
that behaviour, we have no quarrel, since this is precisely what Harré is main- 
taining. If he means that such reference is not part of some sort of explanation 
then he is just wrong. Whether or not this concept of potential can properly be 
said to figure in an account of the nature or real essence of a field is clearly quite 
irrelevant to the point. Since that is the only argument he offers, the analysis 
must stand until either some other argument can be offered, or an adequate 
range of counter-examples can be assembled to show the parochial character of 
Harré’s analysis. 

BARRY COHEN 
Youngstown State University 
Reviews


COHEN, R. S. and WARTOFSKY, M. W. (eds.) [1969]: Boston Studies in the
Philosophy of Science, 5, Dordrecht, Holland: D. Reidel. Dfl. 58. 
Pp. viii+482. 


Volume 5 of Boston Studies in the Philosophy of Science is a mixed collection of 
essays on various topics in the history and philosophy of science. As with the 
previous volumes in this series, most of the papers originated in the Boston 
Colloquium for the Philosophy of Science, and the volume is subtitled as 
Proceedings of the Colloquium for the 1966-8 period. However, many of the 
papers have been substantially revised or expanded since their original pre- 
sentation, and others have been added. The essays vary considerably in length, 
from a 150-page monograph by Adolf Grünbaum to articles of less than 10
pages by Aage Petersen and Paul Roman. A few also include commentaries, some 
substantial contributions in their own right, and others rather brief comments. 
The papers also vary considerably in degree of technicality, sophistication, and
detail of argument. Some present the results of important new research and are 
significant contributions to their fields, but will be understood and appreciated 
only by readers with a considerable background in physics or in the more 
technical areas of philosophy, like Tarskian semantics. Others can tantalise but 
sometimes annoy the reader by presenting brief summaries or popularisations of 
material that the author has developed at length elsewhere. Here the more 
sophisticated readers will want to know more before they can properly evaluate 
the arguments. 

As a result, any serious philosopher of science is likely to find much that is 
interesting, novel, and relevant in the book. Nevertheless, the book as a whole is 
not really successful. The range of topics covered is too diverse and disconnected, 
so that it reads more like an expanded version of a journal than a book. A pro- 
spective reader who is interested in only a few of the articles may be put off by 
the price. In addition, he may find those in his particular field at just the wrong 
level. The loose and rather random organisation of the book reflects the structure 
of the colloquium itself, which has always emphasised diversity. However, in 
oral presentations one can have a different audience each time. It would be much 
more effective if the various volumes in this series were to cover a longer time 
period, so that they could be organised and grouped by topics. But even within
this particular volume, it would be better to have articles on related topics 
together, perhaps with an introduction or heading, rather than scattered through- 
out the book. 

The ‘lead essay’, Grünbaum’s ‘Reply to Hilary Putnam’s “An Examination of
Grünbaum’s Philosophy of Geometry” ’, is precisely that. Grünbaum analyses
Putnam’s critique point by point, almost sentence by sentence, in an extremely
detailed and meticulous attempt to show that absolutely none of Putnam’s
charges are justified, on the basis of Grünbaum’s whole corpus of previous
writings in this area. As such, it reads like an elaborate legal brief.

The article has three long sections and several shorter ones. In the first major
section Grünbaum reiterates his defence of Riemann’s conception of the alternative
metrisability of the separate space and time continua of physics, although
he admits that this does not apply to the 4-dimensional space-time interval on 
which Putnam concentrates. This in turn requires that one must select one of 
these incompatible metrisations by a convention, e.g. that a rod chosen as a 
possible standard remain self-congruent under transport. Section 8 is a careful 
and detailed analysis of the meaning of simultaneity in both the special and 
general theories of relativity, and of the conventions required in each case. Here 
Grünbaum emphasises the importance for Einstein of the fact that material
clocks transported from one place to another by paths of different lengths do not
remain synchronised as prior to the fact that light is the fastest possible signal
in vacuo. He claims that Putnam had mistakenly interpreted him as denying this,
in describing a ‘quasi-Newtonian’ world in which the latter would be true but
not the former, despite Grünbaum’s own explicit statement earlier.

In Section 9 Grünbaum attempts to disprove a ‘theorem’ of Putnam’s, namely
that any world described by a given system of physics and system of geometry 
can be redescribed using an arbitrarily chosen metric without postulating 
‘universal forces’ which affect all bodies in the same way, on the ground that 
changing the geometry will change the measures of the differential forces 
already assumed to exist. This runs counter to Reichenbach’s claim that we can 
get a unique ‘standard’ physical geometry by requiring that there be no universal 
forces. Although he does not disprove Putnam’s contention formally, Grünbaum
gives various reasons for believing that it is at best a non-sequitur, and at worst,
false. Other sections of the essay deal briefly with different senses of
conventionality, which, he claims, Putnam confuses.

Most of Grünbaum’s arguments, which are generally quite technical, appear
formally sound, and he makes many valid and worthwhile points. Nevertheless,
the essay as a whole remains curiously disturbing. Too often the Grünbaum-Putnam
debate appears as an attempt to score a lot of points against one’s
opponent (to use Grünbaum’s own language on p. 20) at the expense of the
underlying subject matter, without appreciating the value of what one’s opponent
is trying to say. Too often the tone of Grünbaum’s essay is excessively ad
hominem, in accusing Putnam of being ‘grossly careless’ (p. 1), ‘insensitive’
(p. 45), or even ‘irresponsible’ (p. 85) in studying Grünbaum’s previous writings.

This is especially unfortunate because the positive positions of Grünbaum and
Putnam, as opposed to their attacks on each other, often represent differences in
emphasis, rather than outright contradictions. Grünbaum is not really wrong,
but he often emphasises the wrong things, thus giving a somewhat distorted
picture of what Einstein and his successors were trying to do in developing
general relativity. Einstein’s own writings, for example in his replies to objections
in the Schilpp volume, indicate that under pressure he was willing to accept the
Riemann-Reichenbach-Grünbaum conception of congruence. However, the
possibility of making a choice of geometries by selecting a conventional standard 
of length never played an important rôle in the development of general relativity. 
It is essential to remember that Einstein was not a mathematician or philosopher 
trying to develop a theory of geometry or space; he was a physicist trying to 
develop a new theory of gravitation. Such a theory had to be consistent with the
local validity of the principle of special relativity, the equivalence principle and
resultant indistinguishability of homogeneous gravitational fields and accelerated
systems, the field character and finite transmission of all effects, the actual inhomo- 
geneity of real gravitational fields, and finally, the approximate validity of 
Newton’s theory of gravitation. Einstein accepted all these as empirical facts, and 
they were the crucial factors that led him to choose that Riemannian manifold 
whose geodesics coincided with the paths of freely falling particles as the ‘real’
geometry of space-time. Grünbaum emphasises the a priori factors that would apply in
a variety of possible worlds, but both Putnam and Einstein emphasised the specific 
empirical constraints of this world. And Putnam also gives a better account of the 
sense in which the geometry of space-time can attempt to incorporate all of 
physics, rather than just articulating the special relations among conventionally 
chosen objects like rods and clocks. 

Two of the more interesting and provocative, yet difficult and technical 
essays, are those by Peter Havas and John Stachel on ‘Causality Requirements 
and the Theory of Relativity’. Both indicate that in general relativity there are 
serious unsolved problems about the applicability of the classical (Newtonian) 
ideal of determinism to actual physical situations. In classical determinism, if the 
state of a closed system at an instant of absolute time were given, then at any 
other time one and only one possible state could occur. Since the theory allowed 
direct action at a distance and instantaneous transmission of signals, this in- 
formation about the complete initial state was knowable in principle. However, 
the relativistic requirement that no influence could be propagated faster than light 
placed new emphasis on the notion of signal, and restricted us to information 
within our past light cones. Determinism might still hold for truly closed 
systems; however, this is of little practical value if we are constantly receiving 
signals which give us information not contained in our initial data. Under those 
circumstances systems must appear to us as open and undetermined. Deter- 
minism seems most applicable to localised, isolated, and tightly controlled 
laboratory situations, rather than to anything on an astronomical scale. And if 
we try to apply it cosmologically by considering the universe as a whole to be a 
closed system, we are engaging in unverifiable speculation. 

Havas emphasises that mathematically legitimate solutions of Einstein’s 
equations may lead to serious causal anomalies. In addition, causal problems 
arise unless one’s attention is restricted to the uninteresting cases of worlds 
without matter. Otherwise one must introduce singularities or give an adequate 
interpretation of the stress-energy-momentum tensor and its equation of state—a 
difficulty that Einstein himself certainly recognised. Stachel is critical of trying 
to preserve determinism as an ideal, both in view of the necessity of probabilistic 
quantum mechanics and the use of open systems in classical mechanics. If 
general relativity is to be useful or even testable, it must be applicable to open 
systems. Stachel thus de-emphasises the importance of the Cauchy initial value 
problem. He also provides a very intriguing discussion of possible alternatives to 
causality at the microscopic level, in which space-time itself may have a quantum 
structure or be a macroscopic effect of more fundamental interactions, or where 
there may be local signals travelling faster than light. Both articles have ex- 
tensive references to recent research. 

The articles by David Finkelstein, ‘Matter, Space, and Logic’, and by Hilary 
Putnam, ‘Is Logic Empirical?’ take similar stands. Both seek to find a way of 
avoiding the apparent anomalies of quantum mechanics without accepting
either Bohr’s complementarity principle or Bohm’s attempt to interpret them as
the effects of determinate processes involving ‘hidden variables’. Both see an 
alternative in changing the quantum logic, and believe that this alternative is 
both inherently simpler and more in keeping with previous developments in the 
history of science. For Finkelstein previously rigid theories develop ‘fractures’ 
where they no longer hold together at some point. This is followed by a period of 
‘flow’ where the fractured part is replaced by a looser, more flexible, and more 
comprehensive structure. Putnam sees an appropriate parallel in the history of 
geometry. For centuries Euclidean geometry was considered just as unshakeable 
and a priori as classical logic. However, the attempt to give the simplest
interpretation of empirical facts eventually led Einstein to introduce non-Euclidean
geometry into physics. Putnam thinks that a comparable revolution in logic is 
now appropriate in response to new experience, and that only a dogmatic 
adherent of a rigorous analytic-synthetic distinction should oppose it. 

Finkelstein gives an interesting ‘operational definition’ of the logical con- 
nectives in terms of the tests or experiments we might perform, and the corre- 
sponding results that we would get. If we cannot know a priori what com- 
binations of tests Nature will allow, we may well consider logic empirical. This 
is certainly in the spirit of logic as an ‘organon’ for investigating nature, though 
quite different from the usual formalistic interpretation. On this basis both 
Finkelstein and Putnam conclude that the villain, the factor responsible for the 
anomalies, is not the law of the excluded middle but rather the distributive law. 
Their recommendation avoids having to give up a micro-logic altogether by 
saying that statements about particular properties of a micro-system or their 
combination are simply meaningless. Putnam in fact concludes that his new 
logic allows one to consider quantum mechanics as deterministic but makes the 
initial information needed for complete determination logically contradictory. 

However, this approach still raises problems. Since neither author speaks of 
degrees of distributiveness or anything else, it is not clear how we can consider 
ordinary logic a limiting case of the new quantum logic, still suitable for most 
practical purposes in the way that Euclidean geometry and classical mechanics 
are still good approximations when we have only slightly curved spaces or low 
speeds and masses. In addition, Finkelstein wants to avoid the various dualisms 
of the complementarity approach—of observer versus system, motion versus 
measurement, et cetera. However, he says that we should still use classical logic in 
carrying out computations for the dynamics of the system, and quantum logic 
only for relating these to macroscopic observations. This seems to me a com- 
parable dualism, and not necessarily a better one. 

Aage Petersen, a colleague of Bohr’s, discusses the philosophical significance of 
the correspondence principle, or argument. Petersen sees correspondence, along 
with complementarity, as one of the two central pillars of the quantum revolution, 
or at least of Bohr’s interpretation of it. In the general form to which it evolved, 
it requires that the structural similarity between classical and quantum theories 
must be as close as the quantum constant permits. For Petersen this can effect 
radical changes in our philosophical as well as scientific beliefs. Petersen mentions 
the two most common interpretations of the principle, that it rules out the 
Einsteinian hope of a complete, non-statistical description of all individual 
processes that go on in the world, independent of any observer, and that it 
introduces the subject as an active component in the ‘process of knowledge’.
However, Petersen emphasises its application to more general problems of
epistemology and the relation of language to the world. For him a central
implication is a required harmony between the possibilities of observation and
definition. This can restrict the applicability, if not the formation, of physical concepts.

R. Fürth’s discussion of ‘The Role of Models in Theoretical Physics’
provides a useful typology of the kinds of models that are actually used, and the
goals that a theoretician might hope to accomplish in each case. The account is 
not especially original or critical, but it seems factually accurate. Fürth’s four
types of models are (1) functional, which involves postulating a set of idealised
hypothetical entities obeying perfectly the formalism of the theory; (2) structural,
where the emphasis is less on the kinds of entities than on their dynamic
interrelations; (3) scale, involving comparable material systems differing primarily in
size; and (4) analogue, using similarities between the formalism and others more 
fully worked out in different fields heuristically. 

Two of the briefer papers are pregnant with speculative possibilities. P. 
Roman’s ‘Symmetry in Physics’ attaches great importance to the notions of 
symmetry and invariance in classifying the activities and future possibilities and 
hopes of physics. Both notions relate to structure, and Roman believes that the 
laws of physics fall into one of two structural forms: topological laws governing 
the relative situation of the elements and their dynamic changes, and algebraic 
laws governing the combination of elements into larger wholes. In each case a 
symmetry transformation simply moves from one description of a system to 
another equally possible one. If this transformation leaves certain relations 
unchanged, it represents an invariance. The task of experimental physics is 
then to discover the actual invariances of a system following the symmetries 
defined by theoreticians. Roman’s article provides a programme but does not 
fully justify it, nor does it clearly indicate why the symmetry approach has 
become so popular and relatively successful in elementary particle theory when 
it played a fairly small role previously. It seems to require an ultimate esthetic 
component in nature.

Carl Friedrich von Weizsäcker’s ‘The Unity of Physics’ deals with an even
larger theme. For despite the increasing specialisation and apparent conflict 
between the conceptual frameworks of, e.g. quantum mechanics and general 
relativity, he believes that a single, ultimate, unified, closed conceptual frame- 
work and physical theory is both a possibility and a legitimate research goal. 
This theory will have to do two things: it must characterise completely the
possible motions of an arbitrary system of objects, and it must say what the totality
of these objects can be. At present quantum theory is our best expression of the 
former, elementary particle theory of the latter. 

Weizsäcker seems to rule out the possibility of genuine scientific revolutions, as 
he sees the past history of physics (and presumably its future) as a succession of 
closed theories, each more comprehensive than its predecessors and including 
them as limiting cases. However, his approach is more epistemological than 
historical. In order to do science at all, one must accept the existence and legit- 
imacy of experience. He then offers his central hypothesis, ‘Whoever could 
analyse with sufficient acuity under what conditions experience is at all possible,
would then have to be able to show that these conditions already entail all 
general laws of physics. The physics thus deduced would precisely be the 
unified physics previously conjectured’.
One must be struck by the very Kantian ring of this hypothesis. Certainly
Kant hoped to do the same thing as Weizsäcker, and confidently expected his 
final theory would include, e.g. Newtonian mechanics and Euclidean geometry. 
Since these have in fact been overthrown, why is such a programme more 
plausible now? Unlike Kant, Weizsäcker does not attempt to give a transcendental 
deduction of these conditions of possible experience, or fix them as synthetic 
a priori. Instead he suggests that the investigation of these conditions may 
itself be empirical and proceed along with the details of scientific growth. 
Rather than being dogmatic, he recommends new and fascinating lines of research. 

Weizsäcker’s paper was presented first as a public lecture summarising more
detailed research, and this compression may make it seem puzzling or obscure. 
Fortunately, Francis J. Zucker provides an excellent and clear commentary 
spelling out more details of the epistemological programme. Five basic concepts 
seem to be needed: object, space, interaction, time, and probability. Four kinds 
of theory are needed: particle theory for elements and material structure, 
cosmology for the totality of interactions, relativity for the totality of motions, 
and thermodynamics for irreversibility and the possibility of measuring traces. 
Weizsäcker hopes that these can all become consequences of an expanded
quantum mechanics. The interested reader may well turn to Weizsäcker’s more
complete writings on the subject. 

Bernard Grunstra’s lengthy article, ‘On Distinguishing Types of Measure- 
ment’, provides a thorough and comprehensive account of the problems of 
empirical measurement, and the setting up of different kinds of scale. It also 
contains a careful survey of previous literature on the subject. The article seems 
accurate and worthwhile to anyone in this field, though it provides no novel 
insights on an already well-studied topic. 

Two of the essays are primarily historical. In ‘Hypotheses in Newton’s
Philosophy’ I. Bernard Cohen offers a very careful analysis of the apparent 
discrepancy between Newton’s famous dictum ‘Hypotheses non fingo’ and his 
very considerable use of principles that could only be considered hypotheses by 
Newton’s own standards and were indeed labelled as such. Drawing extensively 
on Newton’s correspondence and the different editions of his works, Cohen 
shows how Newton’s attitude evolved from a tolerance of hypotheses as possible 
explanations with heuristic value to a harder, more ‘positivistic’ line during the 
1690s, so that one must be careful to distinguish the various stages of his intel- 
lectual career. 

‘Ernst Mach’s Biological Theory of Knowledge’, by Milič Capek, is a valuable 
contribution. He shows conclusively that the usual characterisation of Mach as 
an empiricist, sensualist, and phenomenalist fails to bring out the ways in which 
he differed from his British predecessors, or fitted into the late nineteenth- 
century milieu. According to Capek, the missing element is his biological theory 
of knowledge, and especially the rôle of Darwinian evolution. Notions of 
evolution, adaptation, and struggle for survival dominated the intellectual 
community in that period. They were erected into a full metaphysical system by 
Spencer, who influenced both Mach and Poincaré, as well as by Helmholtz. 
Through a process of species adaptation, the character of human thought became 
more closely attuned to the realities of empirical nature, rather than reflecting 
just a lucky accident or pre-established harmony, as Hume’s theory of habit and 
belief in induction suggested. A framework of ideas could thus appear a priori for
any given individual, though it was clearly a posteriori for the species as a whole. In
particular, this led Mach to support determinism, despite his awareness of the 
force of Humean scepticism. Capek makes the interesting claim that Mach and 
Poincaré, unlike the pragmatists, were led to an unnecessary blind spot by this 
very viewpoint. For like Spencer they believed that human evolution was essen- 
tially completed, and thus they dismissed the possibility of radical conceptual 
change in the future, whether in geometry or in physics. This in turn led Mach to 
oppose Einstein. Since Darwin’s theory hardly took present-day man as the end- 
point of evolution, it is hard to see why any of them should have made such an 
interpretation. 

Mihailo Marković’s essay, ‘The Problem of Truth’, is a careful and rather
technical attempt to clarify, defend, and expand on Tarski’s theory of truth 
within the context of modern semantics. He gives careful attention to the in- 
formal requirements that we expect any definition of ‘true statement’ to satisfy, 
and their relation to more traditional theories of truth. 

Wolfgang Yourgrau’s ‘Verification or Proof—An Undecided Issue?’ lies 
within the philosophy of mathematics. Yourgrau rejects both the Kantian claim 
that mathematics is synthetic a priori, and the more customary recent claim 
that it is analytic, along with the related dogma that ‘verification’ applies only to 
synthetic empirical statements, and ‘proof’ to the analytic statements of mathe- 
matics and logic. He gives various illustrations of deductive mathematical 
arguments, whether rigorous or not, that would be hard to classify as verifications 
or as proofs, and concludes that the notion of ‘proof’ is not clearly defined. 
Yourgrau rejects the claim that all meaningful statements in mathematics or 
empirical science must be either analytic or synthetic, but he is also unwilling to 
replace that dichotomy with the continuum suggested by Quine and Morton 
White. He concentrates on mathematical statements, and in some cases seems to 
suggest that both terms might apply; in others, notably some of the unsolved 
conjectures of arithmetic that he also discusses, perhaps neither is applicable. In 
any case the subject is more complex than the orthodox accounts imply, and 
further work still needs to be done on the foundations of mathematics. 

June Goodfield’s paper, ‘Theories and Hypotheses in Biology: Theoretical
Entities and Functional Explanation’, with commentaries by Ernst Mayr and
Joseph Agassi, is a welcome addition to this collection. For years philosophers of 
science have taken physics as their model, and assumed that any characterisation 
of theorising here should apply equally well to other sciences, which in any case
were expected to become more and more like physics. As Mayr mentions, until 
quite recently biologists were forced to choose between regarding their disci- 
pline as a temporary makeshift ultimately reducible to physics, or appealing to 
an unverifiable and probably unscientific ‘entelechy’ or ‘élan vital’. Only in the
latter quarter of the nineteenth century did biology have its fling as a source of 
more general scientific ideas. But many people now think that biology is the 
more exciting science, on the verge of significant breakthroughs, and philosophers 
should see how it is actually operating. 

Goodfield points out that it is typical in physics to postulate the existence of 
some hypothetical entity if it contributes to an explanatory scheme, even though 
there may be no hope at the time for observing it directly. This has proved
successful right down to elementary particle theory today. But by a careful
analysis of the concepts of ‘gene’, ‘organiser’, ‘messenger RNA’, and ‘repressor’,
all of which would seem to qualify originally as theoretical entities, she shows
that biological research does not really fit this model. This is bound up with the 
problem of functional explanations and the fact that concepts like ‘gene’ repre- 
sent units of function as well as simply units of structure. In addition, the pro- 
blem of finding the mechanism whereby a single fertilised egg becomes differ- 
entiated into a complex organism, or information coded in the DNA leads to the 
development of unique individuals, seems different from the problem of finding 
universal causal laws in physics that might be expressed in mathematical 
equations. Although the success of the gene theory may provide an exception, 
Goodfield concludes that explanation in terms of hypothetical units is unlikely in 
biology, both because the existence of the units is rarely questionable (they can 
be observed before their function is known), and because the concreteness, 
relative complexity, and variability within a species of biological organisms 
make it unlikely that such hypothetical entities can lead to the expression of
simple and universal mathematical laws. Without prejudging the ultimate 
reducibility of biology to physics or anything else, it would seem more worth- 
while for philosophers to concentrate on those features that make biology 
unique as a science, both in terms of method and of subject matter. 


JOHN C. GRAVES 
Massachusetts Institute of Technology 


BAR-HILLEL, Y. [1970]: Aspects of Language. Jerusalem: Magnes Press;
Amsterdam: North-Holland Publishing Company. £6.80. Pp. 381.


Professor Bar-Hillel is a learned and perceptive logician and philosopher of 
language. A previous collection, Language and Information, embraced his more 
technical linguistic papers, and one might hope to learn much from the present 
collection of his more philosophical papers about language. Indeed one can; but 
the overall effect is somewhat disappointing. There is less than one would have 
expected that one can get one’s teeth into, either for nourishment or for 
controversy. 

Four historical papers should be excepted from this criticism, of which two 
expound parts of Bolzano’s logic, one discusses leading themes in Carnap’s 
Logical Syntax of Language, and a slighter one some of Husserl’s ideas. Of the
other thirty papers only seven deal mainly with substantive issues in logic or 
linguistics (Chapters 2, 5, 21, 23, 24, 26 and 29), seven are mainly programmatic, 
pleading that language should be studied (or philosophy discussed) in one way or 
not in another (Chapters 16, 17, 22, 25, 27, 32, and 33), while sixteen are mainly 
reviews and criticisms. This division is necessarily very rough. The
programmatic recommendations are backed by substantive illustrations, and the critical
pieces show how many writers have gone wrong through neglect of certain 
principles. But many of the faults for which the delinquents are rebuked are ones 
that we already know to be faults, though such knowledge does not always 
prevent us from lapsing into them. Some sadistic pleasure may accrue from
seeing Geach convicted of careless reading, Ayer and Perelman of inadequate
preparation and looseness of reference, Bernard Williams and many others of
failure to pay due attention to the distinctions between acts of uttering, sentences,
and statements. But fairly few positive points emerge from these criticisms, and 
the same points are made many times. Bar-Hillel himself remarks (p. 113) that 
‘nobody is particularly interested in who was wrong when and where’; but, if so, 
some of the more negatively critical essays could well have been omitted. 
(While on negative criticism myself I may mention two small but irritating 
faults in presentation: though the chapters are numbered in the table of contents, 
there is no number at the head of each chapter, and the date of each paper is 
given neither in the table of contents nor with the paper itself, but has to be dug 
out from another table at the back of the book. Since these dates range from 
1946 to 1970, and since both Bar-Hillel’s views and the context to which they are 
relevant have naturally changed somewhat over this period, it would have been 
better if the date had been shown at the beginning of each paper rather than left 
to be inferred from the footnote references to other work.) 

Front line experience has left Bar-Hillel (no doubt rightly) disillusioned about 
any union between cybernetics and linguistics (Chapter 25). His main
programmatic plea is for a Dreiecksverhältnis between Oxford-style ordinary language
philosophy, the linguistic constructionism of Carnap, and the systematic 
linguistic theory of Chomsky and his associates. More generally, it is that
logicians and linguists should acquaint themselves with each other’s work, and should
not leave, for example, aspects of semantics that call for logical formulation 
(e.g. hyponymy, comparatives, and the relation between such pairs of words as 
‘buy’ and ‘sell’) in a no man’s land between isolated disciplines (Introduction, 
pp. 152-3, 354-63, etc.). This does seem to be the way to advance the
understanding of language, and there are some indications that what Bar-Hillel would like
to see is coming about, perhaps partly as a result of his preaching. An equally 
reasonable plea which is much further from being met is that logicians should 
study seriously the validity of arguments in natural—and more generally in 
pragmatic, context-dependent—languages (Chapters 16, 17, and 32). 

Language is a suitable object for scientific study, as interesting as and perhaps 
more important than, say, lunar rocks. But is there any reason why it should be 
of special interest to philosophers? Logical positivists and ordinary language 
philosophers used to think that attention to meaning or use would somehow 
solve or dissolve the great traditional philosophical questions. Some of this hope 
survives in Bar-Hillel’s essays. In 1954 at least he was prepared to defend 
Carnap’s view that controversial philosophical theses are clarified by translation 
from the ‘material’ into the ‘formal’ mode of speech, by moving, for example, 
from statements about what numbers are to suggestions that it is fruitful and 
expedient to work with language-systems in which numerical expressions play 
certain roles. ‘The decisive advantage is in the transition from “ontological”
disputes to methodological controversies’ (Chapter 11, especially pp. 134-5). 
And in 1964 he argued that Grover Maxwell, in defending realism against 
positivism with regard to theoretical entities, was concerned with a pseudo-issue. 
I cannot discuss this problem at any length, but can only express the opinion 
that a general shift from the material to the formal mode, from ontology to 
methodology, is mistaken. Where it seems appropriate, this is because some onto- 
logical issue could rightly be settled in the negative. It is because there are no 
number-entities that the important question is that about the convenient role for
number-expressions; but despite Carnap and Bar-Hillel (and Nagel), realism
about some theoretical entities has a fair chance of being true, and is not
undermined by the distinction between theory-dependent and (relatively)
theory-independent expressions. It would take too long to argue this, but it is easy to
show that some of Bar-Hillel’s contrary arguments are unsound. He says (pp. 
266-7): ‘Since the only valid reason for believing in the real existence of certain
theoretical entities is, to use Maxwell’s language, the fact that the theories 
referring to them are well-substantiated or “work” . . . it is obviously arguing in 
a circle to look upon the existence of these entities as an explanation for the fact 
that the theories “work” as well as they do.’ In fact, it is obviously not arguing in
a circle. Try a parallel case. If the only reason for believing in the real existence 
of an exemplar from which manuscripts A and B (but not C and D) were copied 
is the agreements in error between A and B not shared by C or D, is it arguing in 
a circle to look upon the existence of this exemplar as an explanation of those 
agreements in error? Certainly not. Though it would be to use the existence of 
this hypothetical exemplar as a ground for believing that the readings in which 
A and B distinctively agreed were errors. Bar-Hillel has been misled by his own 
assumption that a claim that certain theoretical entities really exist can only be a 
‘transposed’ version of the statement that certain theories ‘work’. Maxwell’s 
explanation would become circular if we thus read into it the conclusion for 
which Bar-Hillel is arguing. But then it is Bar-Hillel, not Maxwell, who is really 
arguing in a circle. Anyone who wants to advance the study of the validity of 
arguments in natural language will need to do better than this. 

Bar-Hillel has, at least since 1954, usefully stressed the importance of indexical 
expressions and has exposed errors in arguments that involve them (Chapters 5, 
7, 10, and 12), some of which (e.g. Descartes’s cogito) are of great philosophical 
interest. Along these lines he has offered different solutions at different times of 
the Liar Paradox (Chapters 5, 19, 20, and 24) and hints in the Introduction that 
there may be more to come. It is to be hoped that there are, for even his latest 
treatment is far from complete. All his treatments rest on the utterance/sentence/ 
statement distinctions: apparently-paradoxical utterances can be shown to have 
made no statement. But is it satisfactory to say (p. 283) that if a blackboard has 
written on it only ‘The sentence written on this blackboard is false’ we can decide
that no statement is thereby made (since the assumption that one is made leads to 
a contradiction) whereas if ‘false’ were replaced by ‘true’ we could not decide 
whether a statement is made or not? Bar-Hillel says that ‘recognizing the . . . non- 
existence of a mechanical procedure for deciding whether, given an utterance, a 
statement was made by it, is . . . part of the price that has to be paid for keeping 
our natural languages consistent.’ Mechanical procedure, yes: but surely informal 
reflection shows that in the case of ‘true’ also no statement has been made. Even 
more serious is the fact that there are variants of the Liar which these
distinctions, in themselves, do not solve: e.g. ‘There is no sentence written on this
blackboard which, standardly construed, makes a true statement.’ About this we 
cannot say that ‘the supposedly paradoxical situations can be shown to evaporate 
by realizing that in these situations no statements had been made at all’ (p. 285); 
saying this would commit us to saying that this inscription, standardly construed, 
makes a true statement. I believe, none the less, that this approach is basically 
correct, but it needs to be taken further than Bar-Hillel has taken it. (I have
myself attempted to do so in Chapter 6 of Truth, Probability, and Paradox.)

This book is, then, an interesting personal record, containing what are in the
main sound criticisms of prevalent errors, sensible recommendations for the
study of language, and some scholarly contributions to the history of logic. But
the substantive contributions to philosophy do not go very far, and at least one
central and repeatedly employed philosophical doctrine seems to me to be
mistaken.

J. L. MACKIE
University College, Oxford


ANDRESKI, S. [1972]: The Social Sciences as Sorcery. London: André Deutsch. 
£2.95. Pp. 238. 


At the risk of oversimplifying, one could present Professor Andreski’s view
of the situation and its problematics as follows. The social sciences have under- 
gone a vast expansion in many countries of the world, where social problems 
have given rise to more jobs and more publications. Most of what is published in 
these fields is not much good, and the social problems of those same countries 
which foster the expansion have a suspicious tendency to get worse, not better. 
The output of the social sciences consists of ‘an abundance of pompous bluff and 
a paucity of new ideas’ (p. 11), while ‘95% of research is indeed re-search for 
things that have been found long ago and many times since’ (p. 11). Andreski 
sums up his thesis rather well in this quotation: 


More than that of his colleagues in the natural sciences, the position of an 
“expert” in the study of human behaviour resembles that of a sorcerer who 
can make the crops come up or the rain fall by uttering an incantation. 
And because the facts with which he deals are seldom verifiable, his cus- 
tomers are able to demand to be told what they like to hear, and will punish 
the unco-operative soothsayer who insists on saying what they would rather 
not know—as the princes used to punish the court physicians for failing to 
cure them. Moreover, as people want to achieve their ends by influencing 
others, they will always try to cajole, bully or bribe the witch-doctor into 
using his powers for their benefit and uttering the needed incantation ... to 
allay his gnawing doubts, anxieties and guilt, he is compelled to take the line 
of least resistance by spinning more and more intricate webs of fiction and 
falsehood, while paying ever more ardent lip-service to the ideals of object- 
ivity and the pursuit of truth (p. 24). 


All of this needs some explanation, and Andreski’s attempts seem to go along 
two lines: intellectual and sociological. Intellectually, he argues that the social 
sciences face a complex subject matter, demanding a sensitivity, humanity and 
intelligence rare in our academic population. No clear-cut methodological 
standards exist and so the field becomes ripe for sorcery verging on charlatanry: 


To cut a long story short, scientific method has triumphed throughout the 
world because it bestowed upon those who practised it power over those who 
did not. Sorcery lost; not because of any waning of its intrinsic appeal to the
human mind, but because it failed to match the power created by science
... incantations remain more effective for manipulating crowds than logical
arguments, so that in the conduct of human affairs sorcery continues to be 
stronger than science (p. 92). 


Hard to do, hard to verify, short on results, the social sciences also have such 
great difficulty in being objective and neutral that some of their practitioners 
abandon the search for truth altogether. How difficult and demanding the social 
sciences are becomes clear when we watch intelligent natural scientists wander 
into them and start blundering about (p. 199). 

Andreski’s sociological arguments in explanation of the state of the social 
sciences can be classified into two groups: The State of Our Civilization and The 
Sociology of Knowledge. The (deplorable) State of Our Civilization: in ‘the 
advanced state of cretinization . . . reached under the impact of the mass media’ 
(p. 17), of ‘passive and dull telly gapers without other ideals than stultifying 
conformity to the norms of consumer mentality’ (p. 224), many would rather die 
than think. Consequently, ‘social and political studies have opened the gates of 
academic pastures to a large number of aspirants to the status of a scientist who 
might have been perfectly useful citizens as post-office managers or hospital 
almoners, but who have been tempted into charlatanry by being faced with a 
subject utterly beyond their mental powers’ (p. 204). 

How has this come about? The answer lies in the Sociology of Knowledge. 
Mass production for the mass society has ‘an intrinsic tendency to bring every- 
thing down (though also up) to the average’ (p. 43), and, since few can appreciate, 
still less contribute to, higher culture, expansion of universities and publishing 
simply confirms the law that more means worse. America has more social scientists 
than any other country; likewise, more drug addiction, crime, divorce, race riots. 
‘American schools are the least efficient in the world . . . in no other country can
you become a professor at a top university without having first to learn how to 
write competently’ (p. 27). In France, too, may it be noted, the ‘collapse’ of the 
educational system was preceded by a rapid expansion of the social sciences (p. 
28). 

Expansion, professionalization, and an influx of research money corrupted the 
search for truth. The expansion made it possible for intellectual mediocrities to 
enter the social sciences. Professionalization gave them a chance to rise because 
administrative ability and intrigue are more important than scholarship for suc- 
cess in universities and research teams. And money spread corruption as its 
grateful recipients found it convenient to adopt doctrines that would never result 
in their biting the hand that fed them. Results: universities everywhere are less 
free and vigorous than they were in 1900 (pp. 219-22); students see one dis- 
tinguished professor to be a ‘disingenuous windbag’ and soon all the others lose 
authority too. Small wonder, then, that the students turn to Marx, who at least 
concerned himself with real issues, expressed himself clearly and directly, and 
had some bold ideas. For, we do have genuine social problems to which Marx 
offered some interesting (if no longer acceptable) solutions. These problems 
remain urgent; indeed our civilization will collapse if they are not solved (p.
232).

The above account of Andreski’s book does not follow the order in which 
topics are treated in the text. Various of its chapters argue that jargon is used as a
smokescreen to hide vacuity; that functionalism is a conservative doctrine; that
quantification and cybernetics are virtually nothing more than make-works; and
so on.


2 In view of what follows, I want to draw attention to some merits of the book 
which should not be overlooked. It is lively and spirited in tone, with some good 
jokes; and it probably took courage to publish. At the very beginning, Andreski 
faces this when he asks, why foul one’s nest? His answer is that those prepared to 
think are always few, but that standards cannot continue unless some people are 
prepared to defend them. He doubts whether his ‘exhortation’ and ‘preaching’ 
will help much, but comforts himself with the thought that knowledge is cumu- 
lative and truth will out. The real achievements of the book are the devastating 
critique of various ‘theories’ of development and modernization (pp. 63-76); the 
dismemberment of Lévi-Strauss (pp. 83-5, 131-6); the cutting down to size of
game-simulations of international relations (pp. 118-21); and, long overdue, his 
calling the bluff of Herbert Simon’s Models of Man (pp. 128-30) and of mathe- 
matical economics (pp. 140-3): 


To give a physical analogy of what passes for mathematical sociology, we
would have to put into a mathematical formula statements like ‘if you bang 
them hard enough, most things crack up’. Actually this is being charitable 
to Simon because the last statement, though exceedingly vague, is at least 
true. A better physical equivalent of Simon’s formalization of Homans’ 
theories would be a sentence like ‘the wind bloweth where it listeth’; which 
could also be written in the mathematical symbolism of vector calculus: 
Uy— Vr (Pp. 129). 

In summarising his main argument, however, I confess to having had some 
difficulty reconciling the various positions Andreski from time to time takes up. 
He deplores the state of the expanded universities, but he also points out that in 
the past original thinkers were sometimes kept out: Einstein from graduate 
studies, Newton from early attempts to secure a Cambridge fellowship, Galois 
failing an entrance examination, and so on. Andreski also maintains that once 
inside a university, ‘teaching is bad for the brain—because, speaking habitually 
to a captive audience of one’s mental inferiors, one easily falls into the habit of 
perorating rather than thinking and examining critically one’s opinions’ (p. 215). 
Is the university necessary to, or even good for, intellectual endeavour? Andreski 
seems undecided.

Andreski piles on examples of trivia and gobbledygook ad lib., but rarely 
cites any good stuff against which to measure the bad. Indeed, only the classics of 
sociology, Erving Goffman, David Riesman and his own works are cited with 
approval. (He notices this and offers the excuse that he has done it elsewhere and 
he cannot deal with everything at once). When playing the usual game of trans- 
lating Parsons into English, Andreski fails to mention that both Sorokin and 
Mills¹ did this previously (and better), and that really everything (worth saying)
has already been said in the book edited by Max Black.² So Andreski himself is
not immune to research into what was discovered (not so) long ago. (Incidentally,
at p. 166 and pp. 172-3 Andreski misunderstands rather than translates
Parsonsese.) 


1 Sorokin [1956] and Mills [1960]. 2 Black [1961].

There is also the matter of the intermittent outbursts against the mass media.
We get no empirical evidence against them, no rational argument about them, no 
explanation of why the outbursts are included in this volume at all, or of which 
readers will learn anything new from being subjected to them. 

But these are minor matters. There is a practical proposal to consider: 
Andreski suggests two tests which can be applied to jargon-ridden works of 
social science. One test is to translate it into simpler language and see if it says 
anything; no one can object to that—not in an anyway flooded market, at least. 
The other is to test yourself to see whether you are too stupid to grasp the point, 
too stupid to do high-powered theoretical work. This test should gratify readers of 
this Journal: 

Test your brain power on texts falling within a field where there is little 
room for bluff, and which are intellectually demanding without requiring 
extensive specialist knowledge: namely the less technical books on the 
philosophy of the natural sciences, such as P. W. Bridgman’s Logic of 
Modern Physics, or Rudolf Carnap’s Philosophical Foundations of Physics, or 
Bertrand Russell’s Introduction to Mathematical Philosophy, or J. H. 
Woodger’s Biological Principles—to mention just a few among many eligible 
titles. Now, if despite serious effort . . . you cannot understand them, then 
keep away from high-powered theories and do not attempt to produce 
anything very abstract yourself. Be honest and adjust your aims to your 
abilities. There are many areas of sociology, anthropology, political science, 
psychology and economics where useful work can be done without recourse 
to high-powered abstractions ... where common sense... suffices. How- 
ever, if you have mastered a number of books such as those just mentioned, 
and still cannot understand what some... luminary has written or said, 
then you can... suspect that it might all be nonsense. (pp. 85-6). 


Is this as clear and practical as it looks? Is it not a trifle pretentious in a book 
castigating pretentiousness? The time-honoured high-powered methodological 
technique in such cases is to apply the passage to itself. The reader who wishes to 
test himself—or this reviewer—should carry out the exercise first and then read 
on. 
To begin with, is the quoted passage down-to-earth commonsense, or high- 
powered abstraction? Let us consider each alternative separately. It articulates 
the commonsense theory that we should suit our aims to our abilities. This 
doctrine is rather conservative, since it assumes our abilities are fixed in relation 
to our aims and that we know what the limits of our abilities are. But how can we 
know this until we have aimed high, higher perhaps than we thought we could 
manage? And what if we then achieve our aim? Perhaps we should develop our 
abilities to suit our aims. Whether we should suit our aims to our abilities anyway 
depends upon whether we can suit our aims to our abilities; for, if ought implies 
can, cannot implies not-ought. If we cannot know what our abilities are until we 
have tested them by aiming high, it follows that it is not the case that we ought to 
suit our aims to our abilities.

Is it commonsense to invoke the philosophy of the natural sciences as an 
exemplar of an intellectually taxing, non-specialist, bluff-free test of brain power? 
Time was when the traditional formal logic was invoked in this way; and some
people still cite a classical education as such a hurdle course. We surely cannot 
leave out the I.Q. and other psychological tests which try to estimate brain power.
Be all this as it may, I wonder whether Montesquieu, Comte, Tocqueville,
Smith, Malthus, Ricardo, Mill, Marx, Spencer, Morgan, Tylor, Frazer, Durk- 
heim, Pareto, Weber, Simmel, Malinowski and Radcliffe-Brown would have 
passed Andreski’s test; or, rather, have been discouraged by failure. 

As a transition between commonsense and high-power, let us consider the 
selection of books themselves. To a philosopher of science they would look oddly 
assorted, mastery of all of them indicating very little. One is a pot-boiler; 
transcribed lectures by a philosopher who himself bemoaned his lack of scientific 
knowledge (the Carnap); another is a book with only one simple idea—the 
meaning of a term is the operation used to measure it (temperature means 
thermometer reading)—which is stretched out over several chapters (the
Bridgman); yet another is an elementary introduction to meta-mathematics by 
one of the founders of the subject, who wrote it to while away the hours in jail 
(the Russell); and the last is an attempt to translate some basic biological ideas 
into a formal logical language and demands some logical sophistication in the 
reader (the Woodger). In view of all this perhaps even the man of commonsense 
will see that this is an odd array, and no clear test of anything, let alone brain 
power and ability to do abstract theoretical work. 

Is the passage then high-powered rather than commonsense? Could there be a
counter-example to it, someone who has mastered these books yet produced 
nonsense? Andreski himself provides some when he dismisses Bridgman’s 
political pronouncements as ill-conceived and presumptuous, and also those of 
J. D. Bernal, who doubtless knows the books, yet who exhibits ‘the mentality of a 
marxist backwoodsman’ (p. 199). However, it occurred to me that the duo of
Wittgenstein and Russell is a clincher. Both would be acknowledged as well able 
to master the books. Both were very anti-nonsense. How odd then that Russell 
denounced Wittgenstein’s later philosophy as nonsense, and that Wittgenstein 
exhibited a similar attitude to the kind of philosophy both he and Russell had at 
one time indulged in. 

This practical test, then, is neither necessary to good work (vide the classical
thinkers), nor is it sufficient for good work (vide Bernal, Bridgman, Russell and 
Wittgenstein) in the social sciences. 

Andreski’s proposal, then, for testing our brain power and discriminating 
between sense and nonsense is difficult to understand. Is this because of the 
reviewer’s lack of brain power, or because the text is a nonsense? We can’t use 
the original test to solve this problem, as that would beg the question. So now we 
need a meta-Andreski to provide us with a test applicable to Andreski’s text. 
But this, being by Andreski, will simply raise the problem again. And so on, 
ad infinitum. 


3 The reviewer feels an apology is in order for the above hard-nosed exercise; 
his excuse is that Andreski asks for reviewers’ credentials in a booby-trap set at 
p. 49, which will now be detonated:


never assume without good evidence that the reviewer knows better than 
the author. Admittedly, in a field without firm standards the odds are that 
the book is pretty awful, but it is equally likely that the reviewer is either 
too ill-informed to understand what it is all about, or too lazy to read the 
text on which he is passing a verdict, or too timorous to produce anything
himself and consequently eager to assuage his envy by denigration, if he is 
not simply playing clique politics. 


Despite this, the reviewer now makes so bold as to criticise the book—in toto— 
with a number of straightforward arguments suggested by a careful reading. As 
an essential minimum, a debunking book should not exemplify what it debunks. 
Yet this one does. As I shall try briefly to indicate, while it debunks much sociology
in general, and the methods sociologists employ in particular as pretentious,
its sociology is pretty weak and its methodology not exactly self-effacing. 

(a) It has been done better before—by Sorokin, Mills and Black; one could also 
mention Hayek and Popper.¹ Each of these has a theory about the state of the
social sciences: Sorokin about the winds of intellectual fashion and fad, already 
detected in America by Tocqueville; Mills about the disguised political content 
in the pretence of impartial sociology; Black about the sociologist’s bewitchment 
by language; Hayek about the scientism of eighteenth and nineteenth century 
social science; Popper about the attractions and dangers of historicism. Andreski’s 
suggestions do not add up to a theory, unless you want to call his application of 
Gresham’s law to ideas a theory. But then as a theory it happens to be obviously 
false: implausible sociologically and psychologically. Giving ideas away is not 
like giving money away. Moreover, all the works mentioned above, while they 
are polemical, do not come over as eccentrically opinionated. Here is a more 
commonsense-sounding version of Gresham’s law applied to ideas: a good 
preacher sticks to the main point and does not gratuitously alienate prospective 
converts by airing his totally irrelevant prejudices on peripheral topics. 

(b) It offers no explanation. Andreski’s appeal to an alleged conspiracy of self-serving
charlatans grappling with ideas too difficult for them is hardly an explanation.
Many find Talcott Parsons’s work interesting and worth studying, and
the difficulty of his style not worth bothering about. If they are all deluded (as 
Andreski hints they are, for on p. 203 he argues that the wealthiest American 
universities contain proportionately more phonies because academic success 
demands skill at intrigue and self-advertisement), we should want to know why 
Parsons deludes them more successfully than other sociologists of the same ilk; 
i.e. we want a more intellectually satisfying explanation of his success. If there is 
some intellectual merit to Parsons then surely we want a more intellectually 
satisfying explanation of his success. Here, then, Andreski feigns a sociological 
explanation which is sociologically and methodologically poor. 

(c) It offers no solution. Apart from suggestions to restrict funds, and to get
down to patient hard work without benefit of great breakthroughs, no other sug- 
gestions are offered. The unintended consequences of the adoption of either 
proposal are never even considered. Yet sociology is about unintended conse- 
quences. Much of the social science verbiage Andreski is trying to eliminate with 
his two proposals is quite harmless. Like poetry or chess it has its satisfactions. 
Even a subject like mathematical economics—which Andreski and this reviewer 
would agree to be sometimes rather pretentious—can be beneficial (pp. 140-3). 
Those who are serious and who want to get on with serious thinking will do so, 
as they always have, undisturbed by money going to the phoney. Here again, 


1 Hayek [1952] and Popper [1957]. 
then, the author’s sociology and methodology, if one wants to call them that, are
somewhat inadequate. 

(d) It will strengthen opposition to all social science, good and bad. A propos 
of student unrest Andreski says, ‘once certain top professors came to be regarded
as disingenuous windbags, the status of all their colleagues was automatically
reduced’ (italics mine; pp. 226-7). The author might, perhaps even should, 
read this sentence again and ask himself whether this is not something he 
encourages. Perhaps he thinks those who denounce windbags will be exempt. 
The history of the collapse of cultures and civilisations—a subject which interests 
Andreski—proves otherwise. This, of course, is not to say one should not criticise 
the social sciences; only that debunking the windbag is unproductive and likely 
to lash back. 

What I particularly like in this book is its concern for students and understand- 
ing of their unrest, its good word for Marx and for C. Northcote Parkinson, its 
attempt to be practical, commonsense, and unsmug. So it is depressing to report 
that there is little in the volume which might bait the better students to enter the 
profession and become social scientists. Someone with Andreski’s views is ill- 
advised to go in for blanket condemnations of the subject and its practitioners, 
with quotations plucked from here there and everywhere. If he—a sociologist of 
considerable standing—fails to treat the social sciences seriously how can we 
ever get anyone else to? The way to foster better social science is perhaps to 
point to and analyse good examples, discuss the criteria which they exemplify, 
and which others fail to, and encourage thereby the growth of social science. 
Failure to stress and explain good work in the social sciences, then, by a debunker, 
is poor sociology and poor methodology—especially poor methodology since it is 
both dogmatic and self-defeating. 


I. C. JARVIE 
York University, Toronto 
1 RECENT CRITICISM OF THE THEORY

No other part of orthodox Anglo-American statistical theory has been the 
object of more criticism during the past decade than the Neyman-Pearson 
theory of testing statistical hypotheses (henceforth labelled by ‘NPT’). 
Much of the criticism is concerned with the abuses of the theory—or 
rather, with the hybrid versions of it in current use—in the social and bio- 
medical sciences, and is thus not really directed at the theory itself. The 
more theoretical attacks do not qualify as refutations because they are 
based on views of statistical inference that are fundamentally alien to 
NPT.¹ Of course, this is not to say that they are of no value or that they
do not raise serious difficulties for the theory.² In particular, Hacking’s
Received 8 November 1972 


1 I regard Wald’s sequential analysis, his generalised decision-theoretic approach to
statistical inference and all the developments other than Savage’s Bayesian approach 
spawned by it as being refinements of NPT, rather than as competing theories. 

2 A recent paper by Donald Gillies in this Journal (Gillies [1971]) is noteworthy in this 
respect. Gillies effectively argues that on Popper’s approach to problems of testing, 
hypotheses may be tested independently of the formulation of an alternative hypothesis, 
and that the only alternative hypotheses that advocates of NPT have been able to construct
for such garden variety problems as goodness of fit are artificial and ad hoc.
However, I think it unlikely that any NPT theorist will be convinced by Gillies’s arguments
because of the wide disparity between Gillies’s view of the nature and purpose of
testing and NPT’s. When we turn from the logic of testing to the question of the applica- 
bility of NPT, it might seem that Gillies and an NPT theorist can argue on common 
ground and that Gillies carries the day. But it seems clear to me that even this issue is 
not properly dissociated from the question of the logic of testing.

arguments in his [1965] are based on a principle close enough in spirit to
Neyman’s and Pearson’s early likelihood principle that they should 
provoke the concern of the theory’s advocates. Hacking’s major point is 
that after the experiment a Neyman-Pearson test calls for is performed, 
the size and power of the test are irrelevant to an appraisal of the soundness 
of the decision the test dictates. He tried to show that low size and high 
power are not intrinsically desirable properties of tests from the ‘post- 
trial’ point of view researchers ought to adopt when evaluating a test. 
Hacking failed to clinch this point because the examples he used are highly 
artificial and, as he admits, utilise tests that NPT would not endorse 
(with one very odd exception). I believe that Hacking’s line of argumenta- 
tion is essentially sound within the conceptual framework of NPT, and 
that it can be reformulated so as to constitute a refutation of the theory. 
Specifically, I intend to show, by means of two fairly realistic examples 
used by Neyman in his 1950 textbook to illustrate good ‘best’ tests, that 
the size and power of an NP test are not truly relevant indices of the degree 
of security one should have in using the test; that size and power are 
dangerously misleading concepts; that good ‘best’ NP tests may lead to 
decisions that are clearly stupid in light of the outcome of the experiment 
and the hypothetical probability distributions that govern it—all this in 
terms of principles built into the theory. In short, I intend to show that 
NPT is inadequate on its own terms. 


2 THE CONCEPTUAL FRAMEWORK OF NPT 


In order to prove my case, I must first outline the main ideas from which 
the mathematics of NPT flow on the mature version of the theory. 
Neyman denies that mathematical statistics forms the ‘basis of some 
mental process described as “inductive reasoning” ’,² and holds that
‘in spite of the substantial literature on the subject, the term “inductive
reasoning” remains obscure and it is uncertain whether or not the term
can be conveniently used to denote any clearly defined concept’.³ NPT
is based on the view that it is impossible to construct a scientifically 
respectable theory of testing that utilises a notion of the credibility of an 
hypothesis in the light of experimental data. Neyman has little use for the

1 I rely rather heavily on Neyman’s writings in this and other sections. The early, joint
papers of Neyman and Pearson are mostly concerned with the mathematical aspects of 
the theory, and one gets the impression from his biographical notes in Pearson and Kendall 
[1970] and Barnard and Cox [1962] that Pearson did not concern himself with the 
philosophical aspects of NPT after the publication of these papers. Moreover, in these 
notes Pearson gives the impression that he agrees with Neyman’s systematic account of 
the concept of inductive behaviour and its relation to problems of testing. However, I 


would have no objection to my paper’s being regarded as a critique of Neyman’s version
of NPT.
2 Neyman [1950], p. 1. 3 Ibid.

phrase ‘degree of evidential support’. He has proposed that the theory of
testing (and that of estimation as well) be based on what he labels ‘the 
concept of inductive behavior’, which he defines as a concept of the 
‘adjustment of our behavior to limited amounts of observation’.¹

The content of the concept of inductive behavior is recognition that the 
purpose of every piece of serious research is to provide grounds for the selection 
of several contemplated courses of action. Also the recognition that the desir- 
ability of this or that course of action depends upon the circumstances and, of 
course, on the subjective preferences and beliefs of the individual concerned.²


The import of the above quote is that the theory of testing should be 
viewed as a theory of prudential decision-making. According to NPT, 
problems of testing involve two contemplated courses of action, one which 
is the best to take if some statistical hypothesis H is true, and another 
which is the best to take if some hypothesis H′ is true. The former course
of action is labelled ‘Accept H and reject H′’ and the latter, ‘Reject H
and accept H′’. Neyman repeatedly cautions students of NPT not to
read into these labels the adoption of an epistemic attitude towards 
hypotheses, although the adoption of an epistemic attitude (if indeed one 
can decide to believe) is not ruled out as a possible action in certain 
problems. 

The problem NPT addresses itself to is that of formulating an experi- 
ment the decision-maker (or his assistants) is to perform and a rule of 
inductive behaviour which assigns an action to be taken for each possible 
outcome of the experiment such that each action is prudentially justified 
for the decision-maker relative to his ‘subjective preferences and beliefs’. 
Such a rule-experiment combination is to be called a ‘good test’. NPT
provides mathematical criteria of good tests and mathematical techniques 
for constructing such tests. As all familiar with the theory know, the con- 
cepts of the size and power function of a test are the key concepts involved 
in this technical part of the theory. 


3 NEYMAN’S TUBERCULOSIS SCREENING EXAMPLE 

One of Neyman’s examples of a good NPT test³ involves the problem of
formulating a diagnostic procedure to be used by a clinic in connection 
with a general medical check-up. The diagnosis of the presence or absence 
of active tuberculosis is to be based on several X-rays of the patient’s 
chest, delivered one at a time in random order along with those of other 
patients to a radiologist who is to label each picture ‘positive’ or ‘negative’. 
The pictures are not numbered or described in such a way that the radio- 
logist can identify the patients they are of, to insure that the diagnosis he 
1 Ibid. 2 Neyman [1957], p. 16. 3 Neyman [1950], pp. 268-71.
affixes to each picture is independent of the diagnosis affixed to other
pictures of the patient. The clinic assumes (presumably on the basis of
experience) that if a patient has no trace of active TB the probability that
the radiologist will label a single X-ray ‘positive’ is 0.01, and that if a
patient is moderately affected, the probability is 0.60. Under the independence
assumption and the assumptions just described, the hypothesis
that patient Y is free of the disease is equivalent to the statistical hypothesis
$H_1^Y$ that the labelling of Y’s X-rays is a Bernoulli process with a
probability of 0.01 of ‘success’ on each trial. Similarly, the hypothesis
that patient Y is moderately affected by the disease is equivalent to the
statistical hypothesis $H_0^Y$ that the labelling of Y’s X-rays is a Bernoulli
process with a 0.60 probability of ‘success’ on each trial. After n X-rays
of a patient’s chest are read by the radiologist a final diagnosis is to be
made on the basis of the number k of ‘positive’ labels among the n pictures.
Neyman assumes that the most serious error the clinic could make would
be to give an erroneous negative final diagnosis (reject $H_0^Y$ if it is true),
hence in conformity with NPT, $H_0^Y$ is designated ‘the hypothesis tested’
and $H_1^Y$ ‘the alternative hypothesis’. It is assumed that one of these two
hypotheses is true, presumably on the grounds that a person heavily
affected with TB would not be undergoing a routine check-up. Neyman
proposes that the following would be a good test for the clinic to use:
Take five X-rays of a patient’s chest and give a negative final diagnosis,
i.e. reject $H_0^Y$ in favour of $H_1^Y$, if and only if k = 0. This judgment is supported
by consideration of the hypothetical probability distributions of the six
possible outcomes of the experiment given in Table I below.





TABLE I

k =                  0       1        2       3         4       5
$H_0^Y$ (p = 0.60)   0.010   0.077    0.234   0.346     0.259   0.078
$H_1^Y$ (p = 0.01)   0.951   0.0479   0.001   0.00001   zero for practical purposes

As can be seen from this table, the probability of rejecting the hypothesis
tested if it is true (the size of the test) is 0.01, and the probability of
rejecting the hypothesis tested if it is false (the power of the test) is
0.951. Moreover, there is no other test based on five X-rays that has an
equal or smaller size and greater or equal power, hence this test is uniquely
‘best’ for this sort of experiment. Although many statisticians regard low
error probabilities (the probability of accepting the hypothesis tested if it is false
is one minus the power, or 0.049 for the present example) as intrinsically
desirable, Neyman and Pearson felt obliged to argue that tests with low
size and high power are good. They provided a long run interpretation
of error probabilities which for the present example goes as follows:¹

The interpretation [of the error probabilities of the test] in operational terms 
is as follows: If all the assumptions made are approximately true, and if 
the described procedure of multiple X-ray examinations is used by the clinic, 
then the final diagnosis based on five X-ray examinations will be “negative” 
for about one percent of all tuberculosis patients and about ninety five percent 
of all those who have not contracted the disease. 


I take the point of this quote to be that the application of a test in a 
particular case is to be justified by considering what its performance would 
be for a long run of similar cases. Neyman and Pearson never precisely 
worked out all the details of this sort of long run success interpretation of 
size and power. Their writings contain many references to it, and it is 
fair to hold that they assumed that it can in principle always be provided 
for each testing problem and is the ultimate justification for basing decisions 
to accept or reject hypotheses in individual cases on considerations of 
size and power. 
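The uniqueness claim made above for the five X-ray test can be checked by brute force, at least for non-randomised tests. The sketch below is an illustration added here, not a procedure taken from Neyman; it treats a test as a set of values of k for which a negative final diagnosis is given, and lists every other such set whose size is no larger and whose power is no smaller. The function names `size_and_power` and `rivals_at_least_as_good` are hypothetical, and randomised tests are deliberately left out of account.

```python
from itertools import combinations
from math import comb


def binomial_pmf(k, n, p):
    """Probability of exactly k 'positive' labels among n independent readings."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def size_and_power(region, n=5, p_tested=0.60, p_alt=0.01):
    """Size and power of the test that rejects the hypothesis tested iff k is in region."""
    size = sum(binomial_pmf(k, n, p_tested) for k in region)
    power = sum(binomial_pmf(k, n, p_alt) for k in region)
    return size, power


def rivals_at_least_as_good(region, n=5):
    """All other rejection regions with size <= and power >= those of the given region."""
    ref_size, ref_power = size_and_power(region, n)
    rivals = []
    for r in range(n + 2):  # subsets of {0, ..., n} of every cardinality, empty set included
        for other in combinations(range(n + 1), r):
            if set(other) == set(region):
                continue
            s, pw = size_and_power(other, n)
            if s <= ref_size and pw >= ref_power:
                rivals.append(other)
    return rivals


# Neyman's proposed test: negative final diagnosis iff k = 0.
print(size_and_power((0,)))
print(rivals_at_least_as_good((0,)))
```

For the region containing only k = 0 the list of rivals is empty: the only region of equal or smaller size is the empty one, which never rejects and so has zero power.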


4 SIZE, POWER AND LONG RUN SUCCESS

For the TB example the long run interpretation of size and power is quite 
natural, since we assume that the same test is to be applied to many 
similar individuals. However, for testing problems of a non-repetitive 
nature one has to stretch one’s imagination to great lengths in order to 
find a relevantly similar class of testing problems. Moreover, even if there 
is a natural class of similar problems the appropriate operational interpreta- 
tion does not trivially follow from the nature of the hypotheses under 
test. For example, all that follows from $H_0^Y$ and $H_1^Y$ is that if individual Y
were repeatedly and independently tested, in the long run about 1 per cent of
all diagnoses would be negative if Y has TB, and 95 per cent if he does not.
(It is worth noting that Abraham Wald chose this sort of consequence as 
the basis of his justification of tests with low size and high power. His 
approach will be discussed in the last section.) It would appear, then, 
that the conceptual framework of NPT as developed by its originators 
must incorporate a presupposition something like the following: For 
every test T there is a large open class R of test problems of the ‘same type’ 
as the test-problem that motivated the construction of T; call R a reference
class for T. If the problems of R are treated in the same way T treats the 


1 Neyman [1950], p. 271. 
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test-problem it was designed for, then for each member of R one can 
formulate a test just like T from the view-point of probability theory, 
i.e. based on a formally identical observable random variable, with 
formally identical hypothetical probability distributions on formally 
identical hypotheses tested and alternative hypotheses. Moreover, the 
outcomes of applying such tests are independent of the outcomes of other 
tests associated with R. 

It ‘follows’ from this presupposition that if the problems of a reference 
class R for a test T are actually treated by tests similar to T the proportion 
of treatments that will result in the rejection of a true hypothesis tested is 
approximately equal to the size of T, and the proportion of treatments 
that result in the correct acceptance of the alternative hypothesis is 
approximately equal to the power of T.¹
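
The long-run claim can be illustrated with a small simulation. This is only a sketch under stated assumptions: the size 0.01 and power 0.951 are the figures of the TB example, while the fraction of true hypotheses in the reference class (`fraction_true`), the number of problems and the function name are hypothetical choices made for illustration.

```python
import random


def simulate_reference_class(n_problems, fraction_true, size, power, seed=0):
    """Treat every problem in a hypothetical reference class by the same test.

    Returns (proportion of true tested hypotheses that were rejected,
             proportion of false tested hypotheses that were rejected).
    Outcomes are drawn independently, as the presupposition requires.
    """
    rng = random.Random(seed)
    true_total = true_rejected = false_total = false_rejected = 0
    for _ in range(n_problems):
        hypothesis_is_true = rng.random() < fraction_true
        # Rejection occurs with probability `size` if the tested hypothesis is
        # true and with probability `power` if the alternative is true.
        rejected = rng.random() < (size if hypothesis_is_true else power)
        if hypothesis_is_true:
            true_total += 1
            true_rejected += rejected
        else:
            false_total += 1
            false_rejected += rejected
    return true_rejected / true_total, false_rejected / false_total


observed_size, observed_power = simulate_reference_class(
    n_problems=200_000, fraction_true=0.5, size=0.01, power=0.951)
print(f"observed size  ~ {observed_size:.3f}")
print(f"observed power ~ {observed_power:.3f}")
```

Changing `fraction_true` leaves both proportions essentially unchanged, which is why size and power alone say nothing about how often a given rejection or acceptance is correct.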

It is worth noting that this presupposition lends itself to the view that 
for each hypothesis H associated with a test T and reference class R for T 
it is meaningful to talk about the (usually unknown) prior probability that 
an H-like hypothesis associated with R is true, since if p is the proportion 
of true hypotheses like H associated with R, it is natural to regard p as a 
prior probability. Thus the conceptual framework of NPT is quite congenial
to a frequentist bayesian approach to testing. Indeed, H. Robbins’
‘An Empirical Bayes Approach to Statistics’,² hailed by Neyman as ‘quite
remarkable’ and as the possible ‘forerunner of extensive important
developments’,³ capitalises on precisely this idea. In fact, in his [1957]
Neyman makes it quite clear that he has no objection to the use of Bayes’ 
formula in connection with prior probabilities of hypotheses, provided 
that the approach is of the ‘inductive behavior kind’, and the prob- 
abilities are not conjured up in an a priori manner. Perceptive readers will 
note that the arguments presented below have a frequentist bayesian 
flavour, in the spirit of von Mises’ development of statistical theory. 
However no explicit use of Bayes’ formula is made, in order to avoid 
giving the false impression that the arguments presuppose anything 
foreign to NPT. 


1 For problems involving composite hypotheses, the same sort of thing holds, but the
formulation is more complex. An illustration is provided in the Appendix.

2 Robbins [1955].

3 Cf. Neyman [1957] and Robbins [1963].


5 SIZE, POWER AND THE RELIABILITY OF TESTS


The question to be answered is whether or not it follows from the long
run interpretation of error probabilities that ‘best’ tests with low error
probabilities always lead to prudentially justified decisions for a
decision-maker who regards the error probabilities as acceptable upon
consideration of the cost of erroneous decisions, the cost of experimentation etc.
Let us put ourselves in the position of the clinic of Neyman’s TB example. 
Neyman’s operational interpretation does show that the test has desirable 
features, considered as a rule to be employed many times. But what needs 
to be shown is that these features, or others that follow from the interpreta- 
tion, are such as to justify its application to each individual case. I am not 
here raising the abstract philosophical problem of whether or not particular 
decisions or guesses can be justified by the long run success of the general 
policies they are instances of. Rather, I am adopting the view, implicit 
in the conceptual framework of NPT, that they can be, and am merely 
asking if the low size and high power of the test guarantees that this is one 
of those cases. Consider an example of the simplest kind of situation where 
they can be: I am offered a choice between (1) gambling 10 dollars on a
single roll of a die I take to be fair such that I win if a non-six is rolled,
and lose otherwise; (2) the same gamble with ‘even number’ substituted
for ‘non-six’ and ‘odd number’ for ‘six’; (3) not to choose either gamble.
The choice of (1) is prudentially justified for me because I believe that
guessing ‘non-six’ will be right more often than guessing ‘even number’ 
in a long run of similar situations, and also that the relative frequency of 
correct guesses among those made on non-six is so high as to justify the 
risk of losing ten dollars. An equivalent description of my state of mind 
is that (since I regard the die as ‘fair’) I believe that the rule to guess 
‘non-six’ is more reliable than the rule to guess ‘even number’, and that 
the former rule is reliable enough to warrant risking the loss of ten dollars. 
The point of this example is that in view of the fact that Neyman’s 
diagnostic test is ‘best’, his operational interpretation of its size and power 
justifies applying it to the outcome of the X-ray examination of a patient 
if and only if it follows from the interpretation and all the clinic’s informa- 
tion that the test is reliable enough for outcomes of this type to warrant 
the risk of a mistaken diagnosis. I will argue that Neyman’s interpretation 
never justifies applying the test to specific cases. Before presenting this 
argument, it is necessary to precisely define the concept of reliability, 
and to spell out this principle of justified decision-making. 

The reliability index of test T for experimental outcomes of type E in 
reference class R is the proportion of those decisions based on outcomes 
like E that would be correct were each member of R actually treated by 
tests like T. We will hold that a decision-maker is prudentially justified
in following the dictate of a test for the outcome of an experiment for which
the test was designed if and only if (1) the outcome is ‘normal’, (2)
the narrowest experimental classification of the trial’s outcome for which
he has ‘adequate’ information about the reliability of T (for his chosen
reference class for T) is such that the reliability index of T for this
classification is high enough to balance his risk of error, and (3) no other test
based on the experiment is known by him to be more reliable for this 
classification. A test is good for a decision-maker (excluding the cost of 
experimentation) if and only if he is prudentially justified in following its 
dictate for each normal outcome of the experiment. Of two tests designed 
for the same experiment, one is better than the other if it is at least as 
reliable as the other for every possible experimental outcome and more 
reliable for at least one such outcome. 

Let y stand for the percentage of moderate cases of TB among some 
large class of potential examinees, say all adults of some socio-economic 
class in a particular region. Thus y may be taken as the percentage of
hypotheses of the reference class of the form H₀^Y that are true. Let z stand
for the percentage of people in the class who do not have TB. The 
average frequency per hundred potential examinees with which the twelve 
possible categories would occur is given by Table II below, which is 
computed from Table I in an obvious way. 


TABLE II
k =        0        1        2        3          4        5
H₀ true    0.01y    0.077y   0.234y   0.346y     0.259y   0.078y
H₁ true    0.951z   0.048z   0.001z   0.00001z   zero for practical purposes


It follows from Table II that the proportion of tubercular possible 
patients among those who would receive zero positive X-rays were the 
test applied to every possible patient is approximately 0.01y / (0.01y +
0.951z). The reliability index of the test for such outcomes (k = 0) is one
minus this number, or 0.951z / (0.01y + 0.951z). Hence even if we
assume that z = 100 − y, as we must in order to hold that it is rational to
base decisions on size and power, the reliability of the test for rejection of
H₀ (negative diagnosis) cannot be computed without knowledge of the
value of y. In fact, we see that it can range from zero to one, from complete
fallibility to complete infallibility. In general, unless the size of a test is
zero (which it never is in real life), low size, with or without high power,
does not guarantee that a test has high reliability for those outcome types
that lead to rejection of the hypothesis tested. There is nothing
paradoxical about this: The size of a test is the maximum relative frequency
of erroneous rejections of the hypothesis tested among all members of the 
reference class, whereas the reliability of a negative diagnosis is determined 
by the relative frequency of erroneous rejections among those members of 
the reference class for which a rejection will occur. These values can be 
radically different. It would be as stupid to regard one minus the size of a 
test as indicating the reliability of the test for rejection of the hypothesis 
tested as it would be for a male American of age fifty who has had a major 
coronary attack to regard the proportion of all fifty-year-old American 
males who will suffer a fatal coronary prior to their fifty-first birthday as 
indicating his chance of having a fatal coronary prior to his fifty-first 
birthday. At the risk of labouring the obvious, consider the following 
‘test’ procedure: Reject H₀, i.e. give a negative diagnosis, if and only if
the patient has red hair and blue eyes. Suppose that the presence of these
attributes is independent of the presence of TB and that one patient in a
hundred has red hair and blue eyes. The size of this ‘test’ is 0.01; yet a
negative diagnosis determined by it is exactly as reliable as one determined
by the ‘test’ which rejects H₀ if and only if the patient’s X-rays reveal
that he has bones in his chest, which has a size of one. They are equally
reliable because the proportion of negative diagnoses that are erroneous is
y/100 for both of them.
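
The dependence of the reliability of a negative diagnosis on y can be checked numerically. The following is a minimal sketch, assuming the Table II entries 0.01 and 0.951 for zero positives and z = 100 − y; the function names are illustrative and do not come from the text.

```python
# Probabilities of zero positive X-rays, taken from Table II.
P_ZERO_IF_TB = 0.01      # size of the test: a tubercular patient shows no positives
P_ZERO_IF_NO_TB = 0.951  # power of the test: a healthy patient shows no positives


def reliability_of_negative_diagnosis(y):
    """Proportion of correct negative diagnoses when y per cent have TB."""
    z = 100.0 - y  # assumption used in the text: z = 100 - y
    return P_ZERO_IF_NO_TB * z / (P_ZERO_IF_TB * y + P_ZERO_IF_NO_TB * z)


def reliability_of_attribute_test(y):
    """Reliability of a 'test' whose rejections are independent of TB status.

    Such a test (red hair and blue eyes, or bones in the chest) rejects in the
    same proportion of tubercular and healthy patients, so that proportion
    cancels and the reliability is 1 - y/100 whatever the size.
    """
    return 1.0 - y / 100.0


for y in (0.1, 3, 50, 99, 99.99):
    print(f"y = {y:>6}: X-ray test {reliability_of_negative_diagnosis(y):.5f}, "
          f"attribute test {reliability_of_attribute_test(y):.5f}")
```

At the extremes y = 0 and y = 100 the first function returns one and zero respectively, which is the range from complete infallibility to complete fallibility described above.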

If one employs the inductive behaviour approach to problems of testing, 
he may utilise his subjective beliefs in his appraisals of tests. Now, if the 
clinic has enough information about the incidence of TB in its class of 
potential patients to know the hypothetical probability distributions 
governing their radiologist’s diagnoses, it is probably able to set upper 
and/or lower bounds for the rate of infection in this population. Suppose 
that the upper bound is 3 per cent. Then the upper bound of the
proportion of patients with TB among those for whom k = 0 is 0.03 /
(0.03 + 0.951 × 97) = 0.00032. Thus negative diagnoses would be erroneous
only 0.032 per cent of the time, i.e. they would be 99.97 per cent reliable.
Even though the clinic would be off by a factor of 30 if it took the size of the
test as indicating the proportion of negative diagnoses that are mistaken,
its error would be innocuous. However the situation is quite different for
acceptances. It is natural to suppose that high power guarantees the
reliability of acceptances of H₀ (positive diagnosis), since if acceptance of
H₀ when H₀ is false is rare, it would seem that acceptances are usually
correct. But calculations like that made for the reliability index of k = 0
show that neither size nor power determines the reliability indices for
k = 1, 2, 3, 4, 5. For example, if y is at most 3, the reliability index of the
test for k = 1 is at most 0.05. In other words, at least 95 per cent of
decisions for this case would be wrong. In order for the power to adequately
reflect the reliability of a positive diagnosis when k = 1, y has to
be at least as high as 92. Few populations would have a rate of infection
remotely near this value. For k = 2 the reliability is at most 7/8 if y is at
most 3. On the other hand, the test is practically infallible for k = 3 if y
is at least 0.1, and absolutely infallible for k = 4 and k = 5. Again there
is nothing paradoxical in all this: One minus the power of a test is the 
maximum proportion of incorrect acceptances among all members of the 
reference class, whereas the reliability of an acceptance for a given value of k 
is determined by the proportion of incorrect acceptances among all cases
with this value of k. 
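
The outcome-by-outcome figures quoted above can be reproduced with a few lines of code. This is a sketch, assuming the rounded Table II probabilities, z = 100 − y, and that the entries described as zero for practical purposes are exactly zero; the function name is hypothetical.

```python
# Table II probabilities of k positive X-rays (k = 0, ..., 5).
P_IF_TB = [0.01, 0.077, 0.234, 0.346, 0.259, 0.078]
P_IF_NO_TB = [0.951, 0.048, 0.001, 0.00001, 0.0, 0.0]  # k = 4, 5 taken as zero


def reliability_index(k, y):
    """Proportion of correct decisions among outcomes with k positives.

    The test gives a negative diagnosis for k = 0 and a positive diagnosis
    for k >= 1; y is the percentage of TB cases in the reference class.
    """
    z = 100.0 - y
    tubercular = P_IF_TB[k] * y
    healthy = P_IF_NO_TB[k] * z
    correct = healthy if k == 0 else tubercular
    return correct / (tubercular + healthy)


for k in range(6):
    print(f"k = {k}: reliability index {reliability_index(k, 3):.5f} when y = 3")
```

With y = 3 the index for k = 1 comes out just under 0.05 and the index for k = 2 close to 7/8, while k = 4 and k = 5 give exactly one under the stated assumption.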

In spite of the fact that there is no paradox, many readers may find it 
difficult to conclude that the feeling of security high power and low size 
provide is purely illusory, and may thus suspect that something must be 
wrong with the analysis. In order to allay such suspicions, I will try to 
show that they do have something to do with reliability, but that in the final 
analysis their function has nothing to do with assessing the reliability of 
specific decisions. Let P₀(k) (P₁(k)) denote the probability of k positives
if H₀ (H₁) is true. The average reliability of rejections of H₀ is

\[
\sum_k \left[ \frac{zP_1(k)}{zP_1(k)+yP_0(k)} \cdot \frac{zP_1(k)+yP_0(k)}{\sum_k \bigl(zP_1(k)+yP_0(k)\bigr)} \right] = \frac{z \cdot \text{power}}{z \cdot \text{power} + y \cdot \text{size}},
\]

where all summations are over those values of k which lead to rejection 
(for the TB test only one value). Thus if the size of a test is small in relation 
to its power (as it is for the TB example) and if the proportion of the 
hypotheses tested in the reference class that are true is not large (as is the 
case of the TB example), rejections are highly reliable on the average. Of 
course, this is compatible with low reliability for specific kinds of rejection, 
and low reliability for rejections on the average when the latter condition 
is not met. 
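
The identity for the average reliability of rejections, namely that the weighted average of the outcome-specific reliabilities equals z · power / (z · power + y · size), can be verified numerically. The sketch below assumes the Table II probabilities; the two-outcome rejection region `[0, 1]` is a hypothetical test used only to exercise the summation, since the TB test rejects for k = 0 alone.

```python
P_IF_TB = [0.01, 0.077, 0.234, 0.346, 0.259, 0.078]     # tested hypothesis true
P_IF_NO_TB = [0.951, 0.048, 0.001, 0.00001, 0.0, 0.0]   # alternative true


def average_reliability_of_rejection(region, y):
    """Weighted average of outcome-specific reliabilities over the rejection region."""
    z = 100.0 - y
    total = sum(z * P_IF_NO_TB[k] + y * P_IF_TB[k] for k in region)  # all rejections
    return sum(
        (z * P_IF_NO_TB[k] / (z * P_IF_NO_TB[k] + y * P_IF_TB[k]))   # reliability for k
        * ((z * P_IF_NO_TB[k] + y * P_IF_TB[k]) / total)             # share of rejections
        for k in region
    )


def closed_form(region, y):
    """z * power / (z * power + y * size) for the same rejection region."""
    z = 100.0 - y
    power = sum(P_IF_NO_TB[k] for k in region)
    size = sum(P_IF_TB[k] for k in region)
    return z * power / (z * power + y * size)


for region in ([0], [0, 1]):
    for y in (3, 50):
        print(region, y,
              round(average_reliability_of_rejection(region, y), 6),
              round(closed_form(region, y), 6))
```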

By similar computation it can be shown that the average reliability for 
acceptance of the hypothesis tested is 

\[
\frac{y(1-\text{size})}{(1-\text{size})\,y + (1-\text{power})\,z}.
\]

Thus if the power is large in relation to the size and the proportion of 
hypotheses tested in the reference class that are true is not small, the 
average reliability of acceptance is fairly close to one. For the TB example,
where there are good reasons for doubting that this latter condition is met,
the average reliability of the test for outcomes that lead to acceptance
should not be regarded as large. For example, if y is at most 3, the average
reliability for acceptance of H₀ is at most 37 per cent in spite of the fact
that the test is virtually infallible for k = 3, 4, and 5.

Some functions of size and power are also illuminated by the following
considerations: (1) The average reliability index for the test as a whole is
((100 − y)/100) · power + (y/100) · (1 − size). Thus the average reliability
is at least as great as the smaller of the power and one minus the size.
(2) The overall success-to-failure ratio for rejection is (power/size)
times ((100 − y)/y); the overall success-to-failure ratio for acceptance
is ((1 − size)/(1 − power)) · y/(100 − y). Now consider a test that accepts
or rejects H₀ by pure chance, i.e. a test that rejects H₀ if a trial of a certain
random device has one outcome and accepts otherwise. Let p be the
probability of the outcome type that leads to rejection. The frequency 
table analogous to Table II is given by 


TABLE III
             reject H₀       accept H₀
H₀ true      yp              y(1 − p)
H₁ true      (100 − y)p      (100 − y)(1 − p)


The success-failure ratio for rejection for this test is (100 − y)/y and the
ratio for acceptance is y/(100 − y). Comparing the success-failure ratios
for the two tests it is easily seen that the overall success-failure ratio for
rejection on Neyman’s test is power/size times greater (95.1 times greater)
than the same ratio for the test based on pure chance. Thus the ratio of
power to size does indicate in quite a direct way the overall reliability of
rejection of the hypothesis tested relative to guessing by pure chance.
Similarly, the success-failure ratio for acceptance is (1 − size)/(1 − power)
times greater (19.8 times greater for the TB test) than the same ratio for
the chance test. 
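
The comparison with the chance test can be worked through in code. This is a sketch, assuming size 0.01 and power 0.951 for the TB test and z = 100 − y; the function names and the values of y and p tried are illustrative only.

```python
SIZE = 0.01    # probability of rejecting the tested hypothesis when it is true
POWER = 0.951  # probability of rejecting the tested hypothesis when it is false


def neyman_ratios(y, size=SIZE, power=POWER):
    """Overall success-to-failure ratios (rejection, acceptance) for the TB test."""
    z = 100.0 - y
    rejection = (power * z) / (size * y)               # correct / mistaken rejections
    acceptance = ((1 - size) * y) / ((1 - power) * z)  # correct / mistaken acceptances
    return rejection, acceptance


def chance_ratios(y, p):
    """The same two ratios for a test that rejects with probability p by pure chance."""
    z = 100.0 - y
    rejection = (z * p) / (y * p)
    acceptance = (y * (1 - p)) / (z * (1 - p))
    return rejection, acceptance


for y in (1, 3, 20):
    n_rej, n_acc = neyman_ratios(y)
    c_rej, c_acc = chance_ratios(y, p=0.5)
    print(f"y = {y}: rejection {n_rej / c_rej:.1f} times better, "
          f"acceptance {n_acc / c_acc:.1f} times better than chance")
```

The rejection factor is 95.1 for every y and p. The acceptance factor is about 20.2 with the power taken as 0.951; the figure of 19.8 corresponds to rounding the power to 0.95.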

(3) Finally, I assert without proof, that for a given experiment, any 
drop in power with the size held constant or reduced cannot result in a 
test that is better for every outcome, and must result in a test that is less 
reliable for at least one outcome type. Similar remarks hold for size, 
mutatis mutandis. 

The above considerations of some of the functions of size and power
show that low size and high power are desirable if one is concerned solely 
with the average or overall performance of a test. However, once an 
experiment is performed, and a decision that really counts has to be made, 
the average performance of a test is irrelevant to determining what 
course of action is best to take.


6 TWO WEIGHTY OBJECTIONS TO THIS ANALYSIS


Two very trenchant criticisms of the argument developed above can be 
made: (1) The TB example is atypical of most problems of testing in that
decision-makers usually do not have enough information to set upper
and/or lower bounds to the truth-frequency in some relevant reference
class of the hypotheses tested, hence the only outcome-type for which
there is ‘adequate’ information about the reliability of the test is the broadest
possible one, namely, ‘outcome of the experiment’. This is so because in 
the absence of prior information about truth-frequencies, the only 
reliability index that can be known is the overall one, since the overall 
reliability of a test must be at least as high as the smaller of its power and 
one minus its size. Thus high power and low size guarantee that a test is 
good to use (cf. p. 208) if it is an NPT ‘best’ test, in typical cases of testing, 
and in such cases size and power are the only available indices of reliability. 
A closely related criticism is that typically there is a wide variety of 
reference classes, and by choosing the most favourable one a researcher 
can justify any decision he wants to, whereas the overall reliability of a 
test is independent of the reference class and hence is the only reliability 
index that should be used, in order to insure the objectivity of decision- 
making. 

(2) The second objection is that I have misconstrued the nature of 
NPT, that its goal is not reliability, but rather that of minimising the long- 
run cost of mistaken decisions. Tests that are good to use (cf. p. 208) are 
ipso facto tests whose long run costs are acceptable to the decision-maker, 
but the reverse need not be the case, as the TB example shows. Thus I have 
not refuted NPT on its own terms. 

Before replying to these criticisms, a third, but less weighty objection 
should be noted. It is that there seems to be no way to begin to compute 
reliability indices for specific outcomes when either the hypothesis tested 
or the alternative hypothesis is composite or the observable random 
variable the test is based on is continuous. I regard this criticism as less 
weighty than the others because (a) it is not my intention to provide an 
alternative to NPT (although I will make a few suggestions along these 
lines), but rather to criticise it, so if I have shown that it fails for the sim- 
plest and logically most central kind of testing problem my task is com- 
pleted, and (b) I believe that such reliability indices can be computed in 
a rather natural manner, as I will attempt to show in the appendix to the 
paper.

My reply to criticism (1) consists of several parts. The first is that I
disagree with its premise. It seems to me that the TB example is typical
of problems of testing, in that there usually exists a reference class the
decision-maker is quite willing to adopt and for which he has enough
information to set reasonable, informative upper and/or lower bounds to
the truth-frequencies of the relevant hypotheses. Although this view is
likely to strike many readers as dubious, space does not allow me to argue
for its plausibility. However, I am prepared to give it away, because my
case does not rest on it: The conclusion of (1) does not follow from its
premise, since highly informative relative reliability ratios for specific 
outcome types may always be computed. Even if the premise is true, 
there is no need to fall back on an overall reliability index. This will be 
demonstrated in the next section. 

The argument based on the possibility of wilful selection of a reference 
class to further one’s goals is faulty because it misses the point that NPT
is a theory of prudential decision-making. If a person’s goal in a specific 
problem of testing is to keep himself happy or to fool others the theory 
ought to allow him to pursue it, not to provide checks on his doing so. 
Moreover, it is to the credit of the analysis presented above that it enables 
a decision-maker who wishes to be objective and impartial to consider the 
sensitivity of reliability indices to different values of truth-frequencies, 
and thus provides him with rational grounds for refusing to make decisions 
when so doing would be inimical to his interests. 

Another telling reply to criticism (1) is that if the only reliability index
decisions can be based on is the overall or average reliability of the test,
then NPT’s asymmetric treatment of error probabilities is utterly without
any rational basis. If the argument were good, NPT should address
itself to finding an optimum balance between error probabilities as Hodges
and Lehmann do in their [1964], rather than singling out one as most
important and trying to minimise the other within the limits imposed on
the first. Since this asymmetric treatment of error probabilities is
absolutely central to NPT, to argue in the manner sketched by (1) is fatal
to the theory ... if criticism (2) is not correct.

1 A reviewer suggested that the asymmetry of error probabilities is merely a mathematical
convenience and has no methodological consequences because in practice both error
probabilities and sample size are decided upon simultaneously. This argument is a non
sequitur: The distinction between the hypothesis tested and its alternative, which has
important methodological consequences for tests involving composite hypotheses, is
essentially based on such an asymmetry. Conflicting critical regions result from different
choices as to which error is the most serious and hence which hypothesis is to be labelled
the hypothesis tested.

In support of criticism (2) it must be confessed that there are passages
in the writings of both Neyman and Pearson which suggest that the
criterion of prudent decision-making that underlies NPT is that a decision
is justified if it is made in accordance with a rule that minimises the long
run cost of erroneous decisions. Their writings are not clear on this point.
If this is the case, and the purport of the long run operational interpreta- 
tion is to show that tests with low size and high power meet this criterion, 
the following remarks are in order: (a) NPT is not suitable for problems 
of testing that do not involve repetitive testing (of the sort one finds in 
industrial quality control or large scale diagnostic testing) because common 
sense dictates that low long term cost is not a sufficient condition of 
goodness for a rule to be employed only once or a few times; (b) the theory 
is still inadequate on its own terms because tests with proportionally low 
long term costs of erroneous decisions may have a proportionally high 
cost of operation for certain outcome-types, as the case k = 1 for the TB 
example indicates. It would seem that a person using such a test comes 
close to contradicting himself if he ignores such cases on the ground that 
he is only interested in minimising long term costs and that since they 
occur infrequently (at most 7-7 per cent of the time for k = 1) the overall 
cost is low and that’s good enough for him. If one is interested in mini- 
mising long run costs, they should be reduced wherever they can. One 
argument for basing decisions on size and power alone, relative to this 
weak criterion, is similar to the one considered in (1), namely, that there
is not enough prior information to compute proportional costs for each 
outcome type. It will be shown in the next section that this is not an 
adequate defence. Another argument is that the problem calls for a 
decision for every outcome, so even though there may be outcome-types 
for which the test leads to high long run cost, the luxury of suspending 
judgment is not available. This may be a tolerable defence in some indus- 
trial problems, but in general it is quite weak. Could the clinic of the TB 
example justify to itself a mistaken diagnosis given to a patient for whom 
k = 1 on the ground that the clinic has to give some sort of diagnosis for 
each patient, and that they are justified in their practice by their overall 
track record? If the clinic has the proper attitude of concern for the welfare 
of each patient, it is a very poor justification. (c) Even though I have not,
strictly speaking, refuted NPT on its own terms if its goal is to provide tests 
with low long run cost of erroneous decisions, by showing the difference 
between this goal and that of providing reliable tests, I have shown that the 
theory is too narrow in its proper scope of application to bother refuting. 


7 LIKELIHOOD RATIOS AND RELIABILITY 

Let Q(k) be the ratio of the reliability to the unreliability of the test in the 
TB example for the outcome k positives. (By the unreliability of a test I 
mean one minus its reliability.) Q(k) may also be interpreted as the ratio 
of successes to failures for the test for the outcome k positives. Consider
again the test which rejects H₀ by pure chance with a probability of p.
Let R(k) be Q(k) divided by the success-failure ratio for the chance test 
for that diagnosis Neyman’s test calls for when k positives are observed. 
Referring to Tables II and III we obtain a table of relative success- 
failure ratios: 


TABLE IV 
RELATIVE SUCCESS-FAILURE RATIOS FOR NEYMAN’S TEST 


R(0)    0.951/0.01 = 95.1
R(1)    0.077/0.048 = 1.6
R(2)    0.234/0.001 = 234
R(3)    0.346/0.00001 = 34,600
R(4)    infinite for practical purposes
R(5)    infinite for practical purposes


Thus the ratio of correct to incorrect diagnoses among patients who 
receive 0 positives would be 95.1 times better than the same ratio for those
diagnosed the same way (negative) by pure chance. This ratio is indepen- 
dent of the value of y as long as y is neither zero nor one. The ratio for 
k = 1 shows objectively why any decision for this case should not be 
judged reliable in the absence of informative upper and/or lower bounds 
for y, and that there is a means for detecting cases for which a test may not 
be reliable in the absence of information about truth-frequencies of
hypotheses: For tests involving a simple hypothesis H₀ tested against a simple
alternative H₁ and a discrete sample space, the ratio of likelihoods
P(E ; H₀)/P(E ; H₁) is the relative success-failure ratio for a test that
accepts H₀ when E occurs, relative to this outcome-type. Similarly, the
ratio P(E ; H₁)/P(E ; H₀) is the relative success-failure ratio for outcome
E for a test that rejects H₀ when E occurs. This interpretation can be
placed on likelihood ratios even if the number of items sampled is not 
fixed in advance of the experiment. 
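
That the relative success-failure ratios are likelihood ratios, and so do not depend on y, can be confirmed directly. The sketch below assumes the rounded Table II probabilities and treats the entries for four and five positives under the no-TB hypothesis as exactly zero; the function names are hypothetical.

```python
import math

P_IF_TB = [0.01, 0.077, 0.234, 0.346, 0.259, 0.078]     # tested hypothesis true
P_IF_NO_TB = [0.951, 0.048, 0.001, 0.00001, 0.0, 0.0]   # alternative true


def ratio_from_likelihoods(k):
    """R(k) as a likelihood ratio; the test rejects only for k = 0."""
    favoured, other = ((P_IF_NO_TB[k], P_IF_TB[k]) if k == 0
                       else (P_IF_TB[k], P_IF_NO_TB[k]))
    return math.inf if other == 0 else favoured / other


def ratio_from_frequencies(k, y):
    """R(k) the long way: Q(k) divided by the chance test's success-failure ratio."""
    z = 100.0 - y
    if k == 0:
        q = (P_IF_NO_TB[k] * z) / (P_IF_TB[k] * y)   # correct / mistaken negatives
        chance = z / y
    else:
        if P_IF_NO_TB[k] == 0:
            return math.inf
        q = (P_IF_TB[k] * y) / (P_IF_NO_TB[k] * z)   # correct / mistaken positives
        chance = y / z
    return q / chance


for k in range(6):
    print(f"R({k}) = {ratio_from_likelihoods(k):,.1f}")
```

Calling `ratio_from_frequencies(k, y)` for different values of y strictly between 0 and 100 returns the same numbers, which is the independence from truth-frequencies claimed in the text.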


8 RELIABLE TESTS 


The foregoing analysis of tests of simple hypotheses against simple 
alternatives with a discrete sample space suggests that if one subscribes 
to the conceptual framework of NPT and wants tests of this type to be 
reliable, he should reject the concepts of size and power on the ground that 
they are irrelevant to decision-making. Next he should be aware of the 
possibility that if he fixes his sample size in advance, there may be experi- 
mental outcomes for which neither acceptance nor rejection of one of the 
hypotheses is reliable. If the nature of his problem makes sequential
observation and decision-making impossible, he may have to either take
more observations or give up if one of these outcomes results. Ideally, 
observations should be taken one at a time or in small groups (in order to 
avoid unnecessary expense), and the following calculations performed: 
(a) the likelihoods of both hypotheses for all the data so far obtained;
(b) if there is a well-defined, natural reference class for which trustworthy,
informative bounds on truth-frequencies can be set, reliability indices
using the likelihoods computed under (a) and these bounds in the manner
done in Section 5. Reject an hypothesis if and only if the reliability index for
rejecting for the data obtained to date is high enough to meet practical 
needs. Similarly, accept an hypothesis if and only if the reliability index 
for this type of decision is high enough. Otherwise, take more observations 
or give up. If such a reference class is not available, or if it is desired to 
avoid basing a decision on such relatively subjective factors (as is typically 
the case in scientific research) use the likelihoods of (a) to compute 
relative success-failure ratios; then reject an hypothesis if and only if the
relative success-failure ratio for its rejection is high enough to meet the 
security needs of the problem. Similarly, accept an hypothesis if and only 
if the relevant success-failure ratio is high enough. My own feeling is 
that for scientific work relative success-failure ratios on the order of a 
thousand or more will be required. Since such ratios are as ‘objective’ as 
the probability models adopted for the experiment, and do not depend 
on the choice of a reference class, they serve the goals of scientific research 
admirably. The difference between the approach set forth here and Wald’s 
sequential analysis is that on the latter the boundaries on likelihood ratios 
for acceptance and rejection are determined by setting down error prob- 
abilities like size and one minus power and then engaging in computation 
to find these limits; no direct interpretation is put on the limits themselves. 
Using the approximations of Lehmann [1959], for tests of simple hypo- 
theses Wald’s theory recommends the test which, on our terminology, 
rejects the hypothesis tested if and only if the relative success-failure 
ratio for its rejection is greater than the pre-assigned power divided by the 
pre-assigned size, accepts it if the relative success-failure ratio for accept- 
ance is greater than one minus the size divided by one minus the power. 
Thus for a test with the low size of 0.01 and the high power of 0.95,
Wald’s approximate sequential ratio test rejects the hypothesis tested if the
relative success-failure ratio for rejection is a mere 95, and accepts it if
the success-failure ratio for acceptance is an even lower 19.8. Thus error
probabilities are also liable to be quite misleading in connection with
Waldian sequential tests, i.e. to give the consumers of the method a false
sense of security. (d) The final point suggested by my analysis is that
decision-makers should use likelihoods opportunistically, not worry
about pre-designating sample sizes and using sophisticated mathematical 
techniques, in short, use likelihoods in a common-sense way. 
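
The sequential procedure recommended in this section can be sketched as follows for the case where no reference class is used. Everything specific here is an assumption made for illustration: the per-X-ray probabilities of a positive reading (0.6 with TB, 0.01 without) are chosen because five independent readings with these values roughly reproduce the zero-positive entries of Table II, the threshold of 1000 follows the order of magnitude suggested above for scientific work, and the function name is hypothetical.

```python
def sequential_decision(observations, p_if_tb=0.6, p_if_no_tb=0.01, threshold=1000.0):
    """Take observations one at a time and stop when a relative ratio is high enough.

    `observations` is a sequence of booleans (True = positive X-ray).
    Returns a pair (decision, number of observations used), where decision is
    'accept H0' (positive diagnosis), 'reject H0' (negative diagnosis) or
    'undecided' if the data run out first.
    """
    like_tb = 1.0      # likelihood of the data so far if the patient has TB
    like_no_tb = 1.0   # likelihood of the data so far if the patient does not
    used = 0
    for positive in observations:
        used += 1
        like_tb *= p_if_tb if positive else 1 - p_if_tb
        like_no_tb *= p_if_no_tb if positive else 1 - p_if_no_tb
        if like_tb >= threshold * like_no_tb:
            return "accept H0", used
        if like_no_tb >= threshold * like_tb:
            return "reject H0", used
    return "undecided", used


print(sequential_decision([True, True]))
print(sequential_decision([False] * 5))
print(sequential_decision([False] * 8))
```

Under these assumptions two positives already give a ratio of 3600 in favour of TB, whereas five negatives give a ratio of only about 93 against it, so the fixed-sample outcome of zero positives would leave the decision open and more observations would be called for.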

The above recommendations are intended for those who find the con- 
ceptual framework of NPT congenial. Personally, I do not. I disagree 
with Neyman’s views on the concept of inductive inference. I think that 
researchers are concerned with more than taking appropriate courses of 
action in the light of experimental data, that to define statistical theory as 
a theory of prudential decision-making is to unduly limit its scope. More- 
over, the notion of a reference class, and long-run justifications based on 
it are too abstract and problematic for my taste. I find a credibilist Bayesian 
approach most congenial. But that is another story. It is interesting to note, 
however, that the approach recommended above is exactly parallel to the 
Bayesian approach to testing put forward by Edwards, Lindman and 
Savage in their [1963], the difference being that truth-frequencies in a 
reference class correspond to personal prior probabilities of hypotheses, 
the reliability index for acceptance of an hypothesis H on outcome E 
corresponds to the posterior probability of H given outcome E, and the 
relative success-failure ratio for acceptance of H on outcome E is the ratio
of the posterior to prior odds in favour of H given E. In fact, such a 
Bayesian approach will treat tests of simple hypotheses in exactly the same 
way. Indeed I find it surprising in the light of E. S. Pearson’s remarks
that ‘. . . it was Neyman, brought up in the tradition of the continental
mathematical school, who held longest to the idea of retaining in our
theory measures of prior probability’ and that ‘We also considered how
far inferences and decisions could be based on the numerical values of
likelihood ratios’, that NPT took the form it did, because they surely
must have been aware of the misleading character of size and power.

1 Barnard and Cox [1962], p. 55.


9 WALD’S OPERATIONAL INTERPRETATION OF SIZE AND POWER
Wald put forward in his [1947] an interpretation of size and power quite
different from that of Neyman and Pearson. Wald takes the reference
class for a test to be an hypothetical class of very many repetitions of the
test. For example, for the TB test we are to imagine the test for Y being
repeated a large number of times and the test actually performed as being
an arbitrary one of these repetitions. There are two possible frequency
distributions governing the outcomes of these experiments, one given by
H} and the other by HY. If it is the first then we can be wrong only if k = 0,
hence it is practically certain that we would be wrong about 1 per cent of
the time. On the other hand, if H{ governs the outcomes, it is practically
certain that we would be wrong about 4.9 per cent of the time. Hence the
reliability index of the test is either 0.99 or 0.951. The important
difference between this reference class and the one considered earlier is 
that for it the reasoning involving likelihoods as relative success-failure 
ratios is inadmissible, since mathematically it has the same structure as 
that given by Table II, with the added requirement that either y or z 
equals zero. Thus likelihood ratios have no ‘meaning’ in terms of reliability. 
Moreover, the only statement about the reliability indices of the test for 
each of the specific outcome-types that can be made is that they are either 
zero or one. So it would appear that not only are power and size directly 
related to reliability, they provide the only informative measurements of 
reliability. It would seem that my strictures against size and power are 
unjustified on Wald’s interpretation of them. 

The seductive character of this argument should not blind one to the 
fact that the classification k = 0, k = 1, etc. is the narrowest one for which
there is knowledge of the reliability of the test. It is generally held that the 
narrowest relevant classification for which trustworthy statistical data is 
available should serve as the focus of guessing by frequencies. The fact 
that the reliability index based on the observed value of k is either zero 
or one does not render it less informative than the overall reliability index. 
For example, if k = 1, and the dictate of the test is followed, wouldn’t the 
clinic be deluding itself if it regarded its decision as trustworthy on the 
ground that the overall reliability of the test is at least 0.95? By insisting
that on Wald’s interpretation the overall reliability index is not a relevant 
guide to decision-making we do not commit the fallacy of choosing so 
narrow a reference class for our decision as to vitiate guessing by fre- 
quencies. I have in mind here the fallacy that would be committed by a 
chain smoker who ignored the correlation between smoking and lung 
cancer on the ground that the only statistical data that is relevant to his 
decision to continue or quit smoking is the proportion of people identical 
to him in every possible respect who will get lung cancer, and all that can 
be known about this proportion is that it is either zero or one. 


10 APPENDIX
AN ANALYSIS OF NEYMAN’S VERSION OF FISHER’S LADY TASTING 
TEA PROBLEM 


In this appendix it will be shown that the objections raised against NPT 
can be sustained for more complex ‘best’ tests. Neyman’s treatment of 
this example can be found in his [1950].¹


1 Neyman [1950], pp. 272-82.

A researcher is to investigate a certain lady’s claim that she can taste
the difference between cups of tea which had the milk poured in first and
those in which the tea was poured in first. The hypothesis tested is the
null hypothesis that the lady cannot discriminate. In Neyman’s experi-
ment the lady is to be presented with n pairs of cups of tea, each containing
one with the milk poured in first and the other with the tea poured in first.
Independence of the lady’s classifications is assumed. Let p be the prob-
ability of a correct classification of a single pair on this set-up. The null
hypothesis is equivalent to the hypothesis that p = ½. Neyman takes the
alternative hypothesis to be p > ½. Let X_n be the number of correct
identifications for the n pairs the lady tasted. Neyman considers tests
with n = 5, 10, 40, 60 and 80. Each test is of the form: reject p = ½ if and
only if X_n ≥ t_n. Let β_{n,t}(p) stand for the power function of such a test.
Neyman argues that for a size of approximately 0.05, the tests with n = 5
and n = 10 would be judged inadequate by the researcher because for
values of p between 0.60 and 0.80 the power of such tests would be low.
For example, for n = 10, t_10 = 8 if the size is to be about 0.05, and
β_{10,8}(0.6) = 0.167 and β_{10,8}(0.8) = 0.68. Neyman argues that while tests
with n = 60 and n = 40 with a size of 0.05 have fairly good power func-
tions, the researcher would probably feel secure in accepting the null
hypothesis only for the test with n = 80. For this test t_80 = 48, the size is
0.05 and β_{80,48}(0.6) is already 0.5 and β_{80,48}(0.7) = 0.95.
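
The size and power figures for tests of this form can be recomputed from the binomial distribution. The following is a minimal sketch in Python, assuming that the rejection region is X_n ≥ t_n (the reading under which the n = 10 test with t = 8 has a size of about 0.055). The function names binom_pmf and power are illustrative only and do not come from Neyman.

```python
from math import comb


def binom_pmf(n, x, p):
    """Probability of exactly x successes in n independent trials with success chance p."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def power(n, t, p):
    """Probability that the test rejects, i.e. P(X_n >= t), when the true chance is p.

    Evaluated at p = 0.5 this is the size of the test.
    """
    return sum(binom_pmf(n, x, p) for x in range(t, n + 1))


# Tabulate the power function for the two tests compared in this appendix.
for n, t in [(10, 8), (80, 48)]:
    values = [round(power(n, t, p), 3) for p in (0.5, 0.6, 0.7, 0.8)]
    print(n, t, values)
```

Values obtained in this way are exact binomial sums, so small discrepancies with figures read from published tables or from approximations are to be expected.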

Now let us consider what the reference class for each of these tests is
like. It would consist of a class of problems which can be treated by observ-
ing a random variable Y_n, which is the number of successes on n trials of a
Bernoulli process with probability θ of success and for which the hypo-
thesis tested is θ = ½ and the other is that θ > ½. Such would be the
formal structure of the reference classes. A narrower reference class can
be formed from such a class by taking the subject matter of the hypotheses
θ = ½ into account. For example, we could consider the class of problems
that involve perceptual discrimination and for which θ = ½ is a null
hypothesis. By such a process the researcher may be able to formulate a
reference class for which he has trustworthy information about the
percentage of true hypotheses of the form 0.5 < θ < b.

Let y be the percentage of true hypotheses in the selected reference
class of the form θ = ½ and z the percentage of hypotheses of the form
θ > ½ that are true. Of course, there is no guarantee that y + z = 100;
the NPT long-run justification would assume that the sum is close to 100.
Let F(b) be the percentage of hypotheses of the form 0.5 < θ < b that
are true. F must be monotonic increasing. Let us assume that the form of
F is that of a linear function of b. Since F(1) = z and F(½) = 0, F(b) =
2z(b − 0.5). (This assumption is not crucial to the argument that follows,
because as long as F increases at a moderately uniform rate the values
computed below are quite insensitive to the exact form of F.)

Consider the test for n = 80, which rejects θ = ½ when Y_80 ≥ 48,
which has a size of approximately 0.05 and which is uniformly most
powerful among tests of this size for n = 80. After presenting the power
function for this test Neyman remarked of it: ‘. . . it is not improbable
that the researcher will consider the prospects of success in his experiment
as satisfactory’. Let us consider what his prospects are if he obtains the
results X_80 = 39, 42, 45, 47, 48, 49, 53. The average joint frequency of
occurrence of Y_80 = 39, 42, etc., and the truth of θ = ½ and θ > ½ per
one hundred members of the reference class is computed using the fact
that the joint frequency of Y_n = x and θ > ½ is given by

$$\int_{1/2}^{1} \mathrm{Bin}(n, x, b)\, dF(b),$$

where Bin(n, x, b) is the value that a binomial probability function, with
probability b of ‘success’ on each trial and based on n trials, takes on for x
successes. The values of this integral can be obtained from tables of the
cumulative binomial distribution when F(b) is a polynomial in b (cf. p. xliii
of the Harvard University Tables of the Cumulative Binomial Distribution).
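
For the linear case the integral can also be evaluated without tables. The sketch below, in Python, assumes F(b) = 2z(b − 0.5), so that dF(b) = 2z db, and uses the standard identity that the integral of Bin(n, x, b) over b from ½ to 1 equals P(Binomial(n + 1, ½) ≤ x)/(n + 1). It further assumes that the test rejects when x ≥ t and that the reported ratios compare the coefficient of y with the coefficient of z, which is how the n = 10 figures quoted later in this appendix appear to be formed. All function names are illustrative.

```python
from math import comb


def null_coeff(n, x):
    """Coefficient of y: Bin(n, x, 1/2), the chance of x successes when theta = 1/2."""
    return comb(n, x) / 2**n


def alt_coeff(n, x):
    """Coefficient of z under the assumed linear F(b) = 2z(b - 0.5).

    The joint frequency is 2z times the integral of Bin(n, x, b) over [1/2, 1],
    and that integral equals P(Binomial(n + 1, 1/2) <= x) / (n + 1).
    """
    tail = sum(comb(n + 1, k) for k in range(x + 1)) / 2 ** (n + 1)
    return 2 * tail / (n + 1)


def ratio(n, x, t):
    """Ratio favouring the decision taken at outcome x (assumed reading).

    Rejection (x >= t) is a success when theta > 1/2; acceptance is a
    success when theta = 1/2.
    """
    null, alt = null_coeff(n, x), alt_coeff(n, x)
    return alt / null if x >= t else null / alt


# The n = 10 test, which rejects when x >= 8.
for x in (4, 6, 7, 8, 9, 10):
    print(x, round(null_coeff(10, x), 3), round(alt_coeff(10, x), 3), round(ratio(10, x, 8), 2))
```

Because the sketch works with unrounded coefficients, ratios formed from three-figure table entries can differ from its output, most visibly where the coefficient of y is very small.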


TABLE V

Y_80 =     39       42       45       47       48       49       53        58
θ = ½    0.087y   0.081y   0.048y   0.026y   0.018y   0.012y   0.0013y   0.00003y
θ > ½    0.010z   0.018z   0.022z   0.023z   0.024z   0.024z   0.0247z   0.0247z


If it is assumed that z is approximately 100 − y, the relative success-
failure ratios for the test are given by the following table: 


TABLE VI
RELATIVE SUCCESS-FAILURE RATIOS FOR n = 80

R(39)   R(42)   R(45)   R(47)   R(48)   R(49)   R(53)   R(58)
 8.7     4.5     2.2     1.1     1.33    2.0     19.0    823.3

The values of Table VI show dramatically how misleading the size and
power of the test are. If the null hypothesis is accepted by virtue of X_80
taking on the value 47, the decision is just about as reliable as one based
on the flip of a coin. The same holds for a rejection based on forty-eight
correct identifications. This table speaks for itself. It shows what any
researcher with common sense would know: that far from the prospects
of success being high, the middle values of X_80 yield outcomes to which
the test should not be applied. Limitations of space forbid further analysis
of the test for n = 80 but it is worth noting that if F(b) increases sharply 
with b, acceptances gain in reliability and rejections decrease for the middle 
values. For example, if F(b) increases proportionally to b⁴, R(47) = 2.36
and R(48) = 0.61.

Let us compare the test with n = 80 with the UMP test with n = 10, 
which Neyman rejected as having too low a power function to be of any 
use. 


TABLE VII
JOINT FREQUENCIES PER AVERAGE 100 ITEMS

Y_10 =     4        6        7        8        9        10
θ = ½    0.205y   0.205y   0.117y   0.044y   0.010y   0.001y
θ > ½    0.050z   0.132z   0.161z   0.177z   0.181z   0.181z


This test rejects the null hypothesis if and only if X_10 ≥ 8, and has a size of
0.055. It is easily computed from Table VII that the relative success-
failure ratios are as follows: R(4) = 4.1, R(6) = 1.5, R(7) = 0.73,
R(8) = 4.0, R(9) = 18.1, R(10) = 181. Comparing these relative success-
failure ratios with those of the test with n = 80, it is easily seen that for rejections
this test fares better than the larger test for values close to the boundary
of the critical region. Acceptance of the null hypothesis when X_10 = 6 is
more reliable than acceptance based on X_80 = 47. Decisions based on the
smaller test are about as reliable as those based on the larger for middle
values of X_80. The larger test, of course, yields more reliable decisions
for high or low values of the variable, and this provides the rationale for
taking large samples: the hope is that a very trustworthy decision can be
made. But no matter how large the sample is, there can never be any
guarantee that all experimental outcomes provide the basis for a reliable
decision.

It is important to note that the above results were not obtained under
strong assumptions about the truth-frequencies of certain classes of
hypotheses. The only assumption used is the very weak one that F(b) is
approximately linear in b. The actual values of F did not enter into any
of the computations.


Why did Einstein’s Programme
supersede Lorentz’s? (II)


by ELIE ZAHAR


2 Einstein’s Heuristics.

2.1 Einstein’s Appraisal of Classical Physics.

2.2 The Discovery of Special Relativity Theory: Removal of the
Asymmetry between Classical Mechanics and Electrodynamics.

2.3 The Heuristic Superiority in 1905 of the Relativity Programme:
Einstein’s Covariance versus Lorentz’s Ether. The Power of
Einstein’s Heuristics: Derivation of a New Relativistic Law of
Motion and of E = mc².

3 Einstein’s Programme Supersedes Lorentz’s.

3.1 The Continuity between the Special and General Theories of
Relativity. 

3.2 The Successful Explanation of the Perihelion of Mercury and its 
Role in the Further Development of the General Theory. 


2 Einstein's Heuristics. 

In section 1 I showed that Lorentz’s classical programme was progressive
until after 1905—the year in which Einstein published his Theory of
Special Relativity (hereafter referred to as S.R.T.). In the next two sections
I shall try to deal with the following three questions. First, what were
Einstein’s reasons for objecting to the classical programme and hence for
starting his own? (I have already shown in section 1 that these reasons
could not have been of an empirical kind.) My second question is this.
Once Einstein’s programme was launched, why did other scientists like
Planck, Lewis and Tolman work on Einstein’s programme rather than on
Lorentz’s?¹ Thirdly I shall try to answer the question, at what stage, if any,
did the relativity programme empirically supersede Lorentz’s.

1 My answer will also show that Kuhn’s theory of paradigm-change is not applicable to the
Einsteinian Revolution. (cf. below, pp. 237-8.)


2.1 Einstein’s Appraisal of Classical Physics.

Why did Einstein object to Classical Physics? Let me immediately say that
the answer to this question will not be a psychologistic answer; I shall not
for example be indulging in speculations about Einstein’s childhood. What
I shall try to show is that certain (unfalsifiable) metaphysical beliefs—at
first sight rather vague and empty—which Einstein held, correspond to
heuristic prescriptions which, when skilfully applied to particular cases,
become specific and powerful tools for the invention of scientific theories.
Thus metaphysics can play an important role in starting a new programme,
especially when the existing one is empirically successful. Of course, the
triumph of a programme can be achieved only by empirical means. How-
ever interesting its metaphysics, the programme will ultimately be judged
by its ability to anticipate facts. I should like to formulate, as clearly as I
can, two devices which formed part of Einstein’s heuristics. To these
devices correspond metaphysical beliefs which Einstein articulated in his
later years.


(I) Theories have to fulfil the so-called internal requirement of coherence.¹
Science should present us with a coherent, unified, harmonious, simple,
organically compact picture of the world. The mathematics used in the theory
should reflect the degree of internal perfection of the world. ‘The aim of
science is, on the one hand, a comprehension as complete as possible of the
connection between the sense experiences in their totality, and on the
other hand, the accomplishment of this aim by the use of a minimum of
primary concepts and relations. (Seeking as far as possible, logical unity in
the world picture, i.e. paucity in logical elements.)’²

Einstein went as far as asserting that reality, although independent of the 
mind, was nonetheless knowable a priori. His so-called aestheticism was 
not meant in any subjective sense but was linked to a definite metaphysical 
position. Because Nature is simple, scientific hypotheses ought to be 
organically compact. Simplicity or coherence are not aimed at because 
they please our minds or because they effect economy of thought, but 
because they are an index of verisimilitude. 


1 There is also an external requirement on theories, namely that they be consistent with
empirical results. Thus Einstein writes: ‘The first point of view is obvious: the theory
must not contradict empirical facts’ (cf. Einstein [1949], p. 21). Also: ‘The great attraction
of the theory [General Relativity] is its logical consistency. If any deduction from it should
prove untenable, it must be given up. A modification of it seems impossible without
destruction of the whole.’ (Einstein [1950], p. 110. For Einstein, ‘logical consistency’
meant ‘coherence’ or ‘organic compactness’.) Fortunately Einstein did not follow this
rule.
2 Cf. Einstein [1950], p. 62.


If it is true that the axiomatic foundations of theoretical physics cannot be
derived from experience but have to be freely invented, can we at all hope to find
the right way? Or worse still: does this ‘right way’ exist only as an illusion... To
this I answer with complete confidence that this right way exists and that we are
capable of finding it. In view of our experience so far we are justified in feeling
that Nature is the realisation of what is mathematically simplest... It is my
conviction that we are able, through pure mathematical construction, to find
those concepts and the law-like connections between them, which yield the key
to the understanding of natural phenomena . . . The really creative principle is in
mathematics. In a certain sense I consider it therefore to be true—as was the
dream of the Ancients—that pure thought is capable of grasping reality.¹


I shall illustrate the importance of prescription (I) in 2.2. 
(II) The second heuristic device is more difficult to formulate. Its meta-
physical underpinning is the claim that since God is no deceiver, there can
be no accidents in Nature. All observationally revealed symmetries signify
fundamental symmetries at the ontological level. Hence the heuristic rule:
replace any theory which does not explain symmetrical observational situations
as the manifestations of deeper symmetries—whether or not descriptions of all
known facts can be deduced from the theory. This will become much clearer
with two examples.


(a) The Induction Experiment. 




If we move a magnet with respect to the ether while keeping a conductor 
fixed, then, due to the variation of the magnetic field with time, an electric 
field arises in the whole of space. Let P be any point of the conductor at 
which an electron may be situated. In view of the Lorentz formula: 


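$$\mathbf{F} = e\left(\mathbf{D} + \frac{1}{c}\,\mathbf{v} \times \mathbf{H}\right), \quad \text{where } \mathbf{D} \neq 0 \text{ and } \mathbf{v} = 0,$$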


the electron will experience a force which generates a current in the con- 
ductor. 

We now keep the magnet fixed and move the conductor with velocity v
with respect to the medium. No electric field is created because H is static,
i.e. independent of the time. The situation is very different from the pre-
vious one, so we might expect the current in the loop either not to arise at
all or at any rate to be different from what it was in the first case. However,
in view of $\mathbf{F} = e\left(\mathbf{D} + \frac{1}{c}\,\mathbf{v} \times \mathbf{H}\right)$, where now $\mathbf{D} = 0$ but $\mathbf{v} \neq 0$, a current does
arise, and, if the relative motion between the conductor and the magnet is
the same as in the previous case, the current also turns out to be the same.
This result is wholly explained by Maxwell’s theory; in other words, if we
assume the existence of a preferred frame and accept Maxwell’s equations,
we can infer that the outcome of the experiment depends solely on the
relative motion of the magnet and the conductor, and not on their absolute
motion with respect to the ether. Hence, this time without the aid of any
auxiliary hypothesis, an ether theory yields the undetectability of the ether.
Thus in classical electromagnetism there is a basic ontological difference
between a situation in which a magnet moves in the ether (presence both of
a magnetic and of an electric field) and one in which the same magnet is
stationary (presence of a magnetic field alone). However, when we apply
Maxwell’s equations to compute the current due to the motion of a con-
ductor in the field created by the magnet, the result depends only on the
relative motion between the magnet and the conductor. Thus there exists
at the ‘observational’ level, a symmetry between the following two situations:
(a) magnet moving towards the conductor, and (b) conductor moving
towards the magnet. This conflicts with the asymmetry obtaining at a
higher level. Special Relativity eliminates the asymmetry: equations of
exactly the same form apply, whether we choose the magnet or the con-
ductor as our frame of reference. There are no separate electric and mag-
netic fields but one anti-symmetric tensor which transforms globally.

1 Cf. Einstein [1934], p. 116; my translation.


(b) Equality of gravitational and inertial masses. 


In Newtonian theory the inertial mass m_i of a body represents its laziness,
i.e. its capacity for resisting acceleration. Inertia is a primary irreducible
property of matter which appears in the fundamental laws of motion. The
gravitational mass m_g is a measure of the body’s receptiveness to the gravi-
tational field. According to Newton gravity is not a primary quality to be
treated on a par with inertia or impenetrability. Hence inertia and gravity
ought to be independent properties. One should for instance be able to
alter the gravitational ‘charge’ m_g without affecting the inertia of the body,
in the same way that one can alter the electric charge e while keeping the
inertial mass m_i constant. However, this is not the case: a doubling of m_g is
instantaneously matched by a doubling of m_i. Newton postulated that the
two masses, m_i and m_g, are equal, but did not explain why. In other words,
there is a symmetry between doubling m_i and doubling m_g, which is at odds
with the disparity between the two properties of inertia and gravity. Let us
note that the observational ‘cash-value’ of ‘m_i = m_g’ is the proposition that
all bodies fall with the same acceleration in a given gravitational field.

1 Exactly similar considerations as apply to the classical explanation of the induction
experiment apply to Lorentz’s explanation of the Michelson result (cf. Part I, p. 115).
Once we accept the existence of an ether as the carrier of the electromagnetic field, we are
led to look upon the latter as a state of the substratum. Molecular forces are transmitted
by the same medium, so they also form part of its state; we thus have a good reason for
supposing that molecular and electromagnetic forces are similar, i.e. for accepting the
M.F.H. (Molecular Forces Hypothesis). From this assumption follow the L.F.C.
(Lorentz-Fitzgerald Contraction Hypothesis) and Michelson’s null result. There is
something paradoxical in that, through postulating the ether as a universal medium, we
are driven to the conclusion that it must be undetectable. Was it not dissatisfaction with
this paradox so closely connected with the crucial experiment, which caused Einstein to
look for another explanation? The answer is that Einstein had become aware of the
paradox independently of Michelson, as is indicated in the first paragraph of his [1905]
where the induction experiment is mentioned. What, from Einstein’s point of view, was
an unsatisfactory feature of classical physics is already evinced by Maxwell’s account of
the induction experiment and is, in this sense, completely independent of Michelson.

The problem can be put a little differently as follows. If a moving train 
suddenly decelerates, the passengers, being thrown forward, imagine that 
they are subject to a field of force to which they respond proportionately to 
their inertial masses. The Newtonian physicist will tell them that this field 
is a fictitious inertial field due to an inappropriate choice of coordinate 
system (the train); in fact, by virtue of their inertia, the passengers are still 
moving uniformly in Absolute Space (that is, if we neglect the attraction of 
the earth). 

According to the Newtonians this fictitious inertial field differs funda- 
mentally from the ‘real’ gravitational field created by the earth; it is only by 
accident, namely because m_i = m_g, that all objects respond to the two
fields in exactly the same way. 

Einstein eliminates this asymmetry between gravity and inertia by pro- 
posing that all gravitational fields are inertial; i.e. that all gravitational fields
are created by a (local) acceleration of the frame of reference. To put it 
crudely: being thrown forward in a moving train and being attracted by the 
earth are basically one and the same phenomenon.¹ It is no wonder that all
bodies fall with the same acceleration, since it is the common frame which 
is accelerating under their feet. 

These prescriptions may be susceptible of a more precise formulation, 
but I leave this question open.² Whatever the case may be, the lack of a
more accurate rendering in no way entails that the propositions in question 
must be given a subjective (or psychological) interpretation. Einstein’s 
metaphysical statements are admittedly vague, yet they may still correspond 
to real properties of an external world independent of the scientist’s mind, 
of his private feelings about harmony, perfection and the like. My main 
object will now consist in examining the role the above prescriptions played
in the genesis of S.R.T. In this specific context it turns out that these
otherwise vague rules and propositions assume a very precise form, 
leaving no doubt as to their intended objective meaning. 

1 This is not strictly speaking true. In the case of the train the field is globally eliminable,
whereas in the case of the earth the field is irreducible.
2 Einstein himself thought ‘that a sharper formulation would be possible. In any case it turns
out that among the augurs there usually is agreement in judging the inner perfection of
the theories and even more so the degree of external confirmation’ (Einstein [1949], p. 23).


2.2 The Discovery of S.R.T.: Removal of the Asymmetry between Classical
Mechanics and Electrodynamics.


Let us look at the more general features of Einstein’s objections to Classical 
Physics. According to Einstein, one of Maxwell’s and Faraday’s greatest 
contributions to science was the introduction of the field as a constituent of 
physical reality to be treated on a par with other constituents such as 
corpuscles and electric charge. Lorentz’s electromagnetic theory confronts 
us with a dualism to whose removal Einstein was to devote much of his life: 
on the one hand there are discrete charged particles whose motions are 
governed by Newton’s laws, and on the other hand a continuous field 
obeying Maxwell’s equations. It is true that the charged corpuscles and 
their motions generate the field; but, once started, an electromagnetic 
disturbance propagates itself with velocity c independently of its source; 
the field may act back on the particles, thereby modifying their motion. 
Fields and particles are therefore ontologically on a par. One way of 
resolving this dualism is to explain the behaviour of the field in terms of the 
mechanical properties of an all-pervading medium. Lorentz clearly re- 
cognised that all such attempts throughout the nineteenth century had 
failed; he was about to try a solution in the opposite direction, and in parti- 
cular to explain inertial mass in electromagnetic terms. 

Einstein was clearly dissatisfied with this dualism, as is apparent from 
the following passage: 
If one views this phase of the development of the theory critically, one is struck 
by the dualism which lies in the fact that the material point in Newton’s sense and 
the field as continuum are used as elementary concepts side by side. Kinetic energy 
and field-energy appear as essentially different things. This appears all the more 
unsatisfactory inasmuch as, in accordance with Maxwell’s theory, the magnetic 
field of a moving electric charge represents inertia. Why not then total inertia? 
Then only field-energy would be left and the particle would be merely an area of 
special density of field-energy. In that case one could hope to deduce the concept 
of the mass-point together with the equations of the motions of the particles from 
the field-equations,—the disturbing dualism would have been removed.²


This lack of unity in the physical foundations, which violates prescrip- 
tion (II), was reflected in the mathematical formulation of the theory. 
Einstein explains: 

The weakness of the theory lies in the fact that it tried to determine the phenomena
by a combination of partial differential equations (Maxwell’s field equations for
empty space) and total differential equations (equations of motion of point
masses), which procedure was obviously unnatural.

1 Einstein [1934], p. 160. 2 Einstein [1949], p. 36. 3 Einstein [1950], p. 75.

The dualism was made far worse by Newton’s classical Principle of
Relativity which applies to mechanics but apparently not to electro-
dynamics. In view of the Galilean transformation which physicists took
for granted, Maxwell’s equations seem to presuppose the existence of an
ether, or at any rate of a unique frame of reference in which they would
hold good. Assessing Lorentz’s work, Einstein wrote:

For him [i.e. for Lorentz], Maxwell’s equations concerning empty space applied
only to a given system of co-ordinates, which, on account of its state of rest, 
appeared excellent in comparison to all other existing systems of co-ordinates. 
This was a truly paradoxical situation since the theory appeared to restrict the 
inertial systems more than classical mechanics.²

The Absolute Space Hypothesis, i.e. the assumption that among all
inertial frames there exists a privileged one, is an idle metaphysical com-
ponent of Classical Mechanics. That its elimination does not reduce the
empirical content of Classical Dynamics was clearly recognised by Newton
who wrote: ‘The motions of bodies included in a given space are the same
among themselves, whether that space is at rest or moves uniformly for-
ward in a right line without any circular motion.’³ One could further main-
tain that the Absolute Space Hypothesis was scientifically useless in that 
one could not even in principle define the Absolute Frame as that in which 
Newton’s laws of motion hold good; for if these laws are true in one of the 
inertial frames, they are automatically true in all.⁴

With the advent of the wave theory of light, of Fresnel’s and Lorentz’s 
postulation of a stationary ether,® the situation changed dramatically. One 
could now define the Absolute or Ether Frame as that in which Maxwell’s 
equations are true. Given the old Kinematics and in particular the 
Galilean transformation, this definition singles out a unique frame in which, 
because Maxwell’s equations hold in it, light propagatesitself in all directions 
with the same speed c. The ether frame was taken to be inertial, so that in 
all other frames, whether inertial or accelerated, light would not have a 
constant velocity. This implied the possibility of devising experiments 
which might detect the ‘absolute’ motion of ponderable bodies. The 
experiment would be such that its outcome tells us whether the body in 
question was in motion or at rest in the ether. In this connection Michel- 
son’s experiment is typical: a null outcome would tell us that the earth is at 
rest in the ether, and from a shift of the fringes it would be concluded that 
the earth moves. In this particular case however we know that the earth 
changes its velocity with respect to the inertial frame determined by the 
1 Newton [1686], p. 20. 2 De Haas-Lorentz [1957], p. 7. 3 Newton [1686], p. 20. 
4 In the ‘Science of Mechanics’, which Einstein carefully read, Mach attacked the concept 
of Absolute Space and went as far as proposing that even the distinction between inertial 
and non-inertial frames ought to be abolished (Mach [1883], chapter 2, vi). 
5 For more exact details, cf. Lakatos [1970], pp. 159-65. 

stars, so the earth must at one point of its trajectory be moving in the ether. 
Hence we can predict that the experimental result must be positive. 

Why did Einstein find such developments in the evolution of physics 
‘paradoxical’? We have seen that Einstein disliked the dualism of particles 
and fields. The fact that the laws of mechanics (which govern the motions 
of particles) obey the Principle of Relativity, while Maxwell’s equations 
(which govern the behaviour of the field) do not, makes the dualism much 
worse. Given the problem-situation, there were to my mind two courses of 
action open to a unificationist like Einstein: he could maintain either that 
the Relativity Principle applies neither to mechanics nor to electrodynamics 
or else that it applies to both at the same time. In the first case he could 
have modified mechanics in such a way that it only holds in the ether frame; 
in the second case he would have to extend the Relativity Principle to 
electrodynamics. In its Galilean form, the Relativity Principle is in- 
applicable to electromagnetic theory. At this point however, Einstein’s 
critique of the induction experiment proved crucial in that it tipped the 
balance in favour of extending Relativity to electrodynamics and thereby 
modifying classical kinematics. 
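
The inapplicability of the Galilean form of the Relativity Principle to electromagnetic theory can be illustrated by a short calculation. The following is only a sketch and is not taken from Einstein's or Lorentz's texts: it assumes the one-dimensional wave equation for a field component $\phi$ (a symbol introduced here purely for illustration) as a consequence of Maxwell's equations in empty space, and applies the Galilean substitution $x_r = x - vt$, $t_r = t$.

```latex
% Assumption: \phi(x,t) is any field component obeying the vacuum wave equation
\frac{\partial^2 \phi}{\partial x^2}
  - \frac{1}{c^2}\,\frac{\partial^2 \phi}{\partial t^2} = 0 .

% Galilean substitution x_r = x - vt, t_r = t (chain rule):
\frac{\partial}{\partial x} = \frac{\partial}{\partial x_r}, \qquad
\frac{\partial}{\partial t} = \frac{\partial}{\partial t_r}
  - v\,\frac{\partial}{\partial x_r} .

% The transformed equation acquires extra terms:
\left(1 - \frac{v^2}{c^2}\right)\frac{\partial^2 \phi}{\partial x_r^2}
  + \frac{2v}{c^2}\,\frac{\partial^2 \phi}{\partial x_r\,\partial t_r}
  - \frac{1}{c^2}\,\frac{\partial^2 \phi}{\partial t_r^2} = 0 .
```

The additional terms vanish only for $v = 0$, so on the old kinematics the equation keeps its form in one frame alone; this is the sense in which the Galilean transformation singles out an ether frame.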

To repeat, in the induction experiment there is complete symmetry 
between the two experimental results, which is at odds with the asymmetry 
introduced by the ‘theoretical’ explanation. To put it more pedantically, 
the observational statements describing the behaviour of the currents in the 
conductor are identical in the two cases, but the high-level explanations in 
terms of the accepted theory differ widely. There would be nothing 
intrinsically wrong in this state of affairs, had the asymmetry not been 
introduced through considerations of absolute motion which the Relativity 
Principle forbids. Seen from that angle however, the experiment suggests 
that an extension of the Relativity Principle to include electrodynamic 
phenomena might abolish the ‘theoretical’ asymmetry; it promises to make 
the symmetry between the two experimental outcomes appear, not as a 
fortuitous result, but as a direct manifestation of a general principle, the 
principle of Lorentz-covariance. In this he was following Prescription II. 

In his [1905] Einstein concluded that: 


examples [like the induction experiment] together with the unsuccessful attempts 
to discover any motion of the earth relatively to the light medium, suggest that 
the phenomena of electrodynamics as well as of mechanics possess no properties 
corresponding to the idea of absolute rest. They suggest rather, as has 
already been shown to the first order of small quantities, that the same laws of electro- 
dynamics and optics will be valid for all frames of reference for which the 
equations of mechanics hold good. We will raise this conjecture (the purport of 


1 I do not of course mean that the experiment was ‘crucial’ in the traditional sense of 
refuting one theory while confirming another. 

which will hereafter be called the Principle of Relativity) to the status of a postu- 
late and also introduce another postulate, which is only apparently irreconcilable 
with the former, namely that light is always propagated in empty space with a 
definite velocity c which is independent of the state of motion of the emitting 
body.1 

Note the two references made to mechanics, underlining the important part 
which Classical Relativity played in Einstein’s thinking prior to 1905.2 

In this passage Einstein alludes to the absence of any first-order effects 
of absolute motion, which Lorentz had explained in the Versuch through 
an early version of the theory of corresponding states. This first-order 
equivalence between observers, which runs counter to the preference given 
to a unique frame, must have increased Einstein’s suspicion that the Rela- 
tivity Principle applies to electrodynamics as well as to mechanics; under 
the new theory the absence of first-order effects, instead of being a stray 
fact, would directly reveal the presence of a universal principle. 

What commended the Relativity Principle was therefore its universality, 
its unifying role in subsuming mechanics and electrodynamics under the 
same law and in providing a unified explanation for various features of 
phenomena such as the symmetry in the induction experiment and the 
absence of first-order effects due to the earth’s motion. 

The phrase: 

‘... together with the unsuccessful attempts to discover any motion of the earth 
relatively to the light medium’ 

has given historians and philosophers of science some problems.3 It also 
seems inconsistent with the thesis of Section 1 that the Michelson experi- 
ment played a negligible role in the genesis of S.R.T. 

Einstein might be referring in the quoted phrase to Michelson’s experi- 
ment, which must have been in the back of his mind, if only through 
Lorentz’s [1895].4 This is perfectly compatible with his assertion that the 
experiment came to his attention only after 1905. To my mind the above 
phrase is no more than a casual allusion to a number of results which he had 
registered without any surprise, for they anyway followed from his own 


1 Einstein and others [1923], p. 38. 

2 This was later confirmed in his more philosophical writings. (See De Haas-Lorentz [1957], 
quoted above, p. 229. Also cf. Einstein [1950], p. 55.) 

3 Grünbaum, for example, says: ‘Unless they provide some other consistent explanation 
for the presence of the latter statement in Einstein’s text of 1905, it is surely incumbent 
upon all those historians of Relativity Theory who deny the inspirational role of the 
Michelson-Morley experiment to tell us specifically what other “unsuccessful attempts to 
discover any motion of the earth relatively to the light medium” Einstein had in mind 
here.’ (Cf. Pearce Williams [1968], p. 114.) 

4 He admitted to Shankland that ‘he had also been conscious of Michelson’s result before 
1905, partly through his readings of the papers of Lorentz and more because he had 
simply assumed this result of Michelson to be true’. (Holton [1969], p. 154.) 

conjectures. On this point I agree with Holton1; for otherwise Einstein 
would certainly have cited Michelson’s result in support of his second 
postulate, the Light Principle P2. This postulate presents us with a new 
difficulty. 

Unlike his first postulate (the Relativity Principle) P1, the Light 
Principle P2 is put forward with no justification whatever. Moreover, on the 
face of it, P2 runs counter to Einstein’s prescription (I)2: there seems to be 
no connection at all between the fundamental properties of space-time and 
those of light. Why should purely kinematical considerations involve c? 
The light principle is quite a low-level statement which is not as yet 
integrated into a more general system. It is precisely for this reason that 
philosophers and scientists supposed that Einstein was obeying the dictate 
of experience, basing his second postulate on Michelson’s result. Later on 
in his [1905], Einstein does say that the light principle is in agreement with 
experience: he had after all heard of various experiments trying to detect 
the earth’s absolute motion. Nowhere, however, does he assert that experi- 
ence had suggested the second postulate, or even made it look plausible to 
him. 

I think the problem can be solved simply by examining more carefully 
Einstein’s later writings—in particular his Autobiography—and then 
comparing them with his [1905]. In his [1934] Einstein writes: 

Then came the Special Theory of Relativity with its recognition of the physical 
equivalence of all inertial systems. In conjunction with Electrodynamics or the 
law of propagation of light, it implied the inseparability of space and time.5 

Perhaps the most illuminating passage occurs in Einstein’s [1950]: 

The second principle on which the Special Relativity theory rests is that of the 
constancy of the velocity of light in the vacuum. Light in a vacuum has a definite 
and constant velocity, independent of the velocity of its source. Scientists owe 
their confidence in this proposition to the Maxwell-Lorentz theory of electro- 
dynamics.6 

Also, in his [1949], Einstein tells us about a thought-experiment in which, 
at about the age of sixteen, he imagined himself to be following a ray of 
light at speed c: 

If I pursue a beam of light with a velocity c (velocity of light in a vacuum), I 
should observe such a beam of light as a spatially oscillatory electromagnetic 


1 Cf. Holton [1969], pp. 164-5. 2 Cf. above, p. 224. 

3 Cf. below, pp. 233-4. 4 Einstein [1905], p. 40. 

5 Einstein [1934], p. 143 (my translation). In his [1949] Einstein again says ‘The Special 
Theory of Relativity owes its origin to Maxwell’s equations of the electromagnetic field. 
Inversely the latter can be grasped formally in satisfactory fashion only by way of the 
Special Theory of Relativity. Maxwell’s equations are the simplest Lorentz invariant 
field equations which can be postulated for an anti-symmetric tensor derived from a 
vector field.’ (p. 62). 6 Einstein [1950], p. 56; my italics. 

field at rest. However, there seems to be no such thing, whether on the basis of 
experience or according to Maxwell’s equations. From the very beginning it 
appeared to me intuitively clear that, judged from the standpoint of such an observer, 
everything would have to happen according to the same laws as for an observer who, 
relatively to the earth, was at rest. For how, otherwise, should the first observer 
know, i.e. be able to determine, that he is in a state of fast uniform motion?1 


What is most striking about this passage is the conclusion which 
Einstein draws from his thought-experiment. He does not restrict him- 
self to what seems warranted by the experiment, namely that c is an un- 
attainable speed or that the addition law of velocities must break down. He 
immediately jumps to a general conclusion, or rather puts forward the 
sweeping conjecture: the laws of physics—more specifically those of electro- 
magnetism—would have to be the same for the moving and for the station- 
ary observers. Both historically and epistemologically speaking, Einstein’s 
second starting point—the first one being the Relativity Postulate—is not 
the Light Principle but the proposition: 


(P3) Maxwell’s equations express a law of nature; 


in virtue of P1, they must therefore assume the same form in all inertial 
frames. Maxwell’s equations imply that, within each co-ordinate system in 
which they hold, electrodynamic disturbances propagate themselves with 
velocity c, which velocity must therefore be an invariant. Thus P1 and P3 
imply P2. 
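
The step from Maxwell's equations to propagation with velocity c can be spelled out as follows. This is a standard textbook sketch rather than Einstein's own derivation; it assumes empty space (no charges or currents) and Gaussian units, and writes the fields as $\mathbf{E}$ and $\mathbf{H}$.

```latex
% Assumed vacuum form of Maxwell's equations (Gaussian units):
\nabla\cdot\mathbf{E} = 0, \qquad \nabla\cdot\mathbf{H} = 0, \qquad
\nabla\times\mathbf{E} = -\frac{1}{c}\,\frac{\partial \mathbf{H}}{\partial t}, \qquad
\nabla\times\mathbf{H} = \frac{1}{c}\,\frac{\partial \mathbf{E}}{\partial t}.

% Take the curl of the third equation and use
% \nabla\times(\nabla\times\mathbf{E}) = \nabla(\nabla\cdot\mathbf{E}) - \nabla^2\mathbf{E}:
-\nabla^2\mathbf{E}
  = -\frac{1}{c}\,\frac{\partial}{\partial t}\,(\nabla\times\mathbf{H})
  = -\frac{1}{c^2}\,\frac{\partial^2 \mathbf{E}}{\partial t^2}

% Hence each field component obeys a wave equation with propagation speed c:
\nabla^2\mathbf{E} - \frac{1}{c^2}\,\frac{\partial^2 \mathbf{E}}{\partial t^2} = 0 .
```

If these equations hold in every inertial frame, as the Relativity Postulate together with P3 requires, the same constant c appears in the wave equation of every such frame, which is the content of the Light Principle.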

In the electrodynamical part of the 1905 paper, Einstein does in fact 
suppose that Maxwell’s equations are Lorentz-covariant and then deduces 
the transformation laws for E and H. He does not try to infer P3 from P1 
and P2; so the electrodynamic part, by exhibiting a transformation which 
makes Maxwell’s equations covariant, simply established that the latter are 
compatible with the Relativity Postulate and the light-principle. It is a 
consistency proof. Although P3 is a stronger statement than P2, it is more 
plausible and incidentally less counter-intuitive. In accordance with (I), 
P3 derives its plausibility from being a unified, well-knit theory in which 
the primitive concepts (electric field, magnetic field, charge density) are all 
closely connected; also it had been tested for a whole generation prior to 
1905. Thus the logical order is reversed through a priori heuristic considera- 
tions: P3 is more plausible, though stronger, than P2. 

Another piece of evidence which confirms the view that Einstein 
approached the problem of Relativity through Maxwell’s equations and 
their covariance is to be found in Lorentz’s [1895]. In a part of this work 
which is completely independent of Michelson’s experiment and of the 


1 Einstein [1949], p. 53; my italics. 

Contraction Hypothesis, Lorentz had proved that no first-order effects of 
the earth’s motion can be detected. Neglecting all terms in $(v/c)^2$, he used a 
limiting case of the Lorentz transformation; he then found transformation 
laws for the fields E and H, under which the equations take on a form very 
similar to, and in some cases identical with, the form assumed in the ether 
frame. It is not far-fetched to suppose that Lorentz’s techniques made a 
strong impression on Einstein; they might well have led him to wonder 
whether a more general transformation would yield complete covariance 
together with the result that no effects whatever arise from uniform 
rectilinear motion. Lorentz himself was to attempt this solution in his 
[1904], which Einstein did not read before publishing the ‘Electrodynamics 
of Moving Bodies’. 

However, Lorentz’s programme for a Theory of Corresponding States 
was outlined in the Versuch and one is struck by the similarity between the 
methods used by Lorentz and by Einstein. In his [1905] the latter first 
constructed a transformation law for the coordinates x, y, z, t; then, 
assuming the covariance of Maxwell’s equations, he deduced the trans- 
formation laws for E, H and $\rho$. Lorentz’s influence on Einstein cannot be 
overrated; it was not Michelson, the experimentalist, but Lorentz, the 
theoretician, who played a considerable inspirational role in the genesis of 
Special Relativity. This is indicated in the all-too-brief second paragraph 
of Einstein’s [1905] by the clause: ‘as has already been shown to the first 
order of small quantities.’ 

We have seen that Einstein rejected Lorentz’s classical approach, but he 
made use of Lorentz’s tremendous technical achievement, albeit under 
very different kinematical assumptions. I have mentioned that Lorentz’s 
[1892a] already contains the full Lorentz transformation up to a constant 
factor in the expression of z’.1 Einstein’s greatest contribution was to extend 
Lorentz’s methods and give the transformed quantities a realistic inter- 
pretation in the ‘moving’ system. 

One might still wonder why Einstein did not start by postulating P1 and 
P3 instead of P1 and P2. He had, I think, at least two good reasons for 
presenting the new theory in the way he did. On the one hand it is prefer- 
able, from the logical point of view, to use the weaker assumption P2 which, 
in conjunction with P1, suffices for developing a new kinematics and deriving 

1 Cf. Part I, p. 112. 

2 It is now finally clear why Einstein could rightly claim that Michelson’s experiment had 
been quite irrelevant to his work and that he could easily have anticipated its null out- 
come. That almost nothing is cited in support of the Light Principle may be due to the 
fact that it follows from a well-corroborated hypothesis (Maxwell-Lorentz equations) 
together with the Relativity Postulate P1, for whose acceptance Einstein had already 
argued. (Cf. Part I, pp. 107-8.) 

the Lorentz transformation. On the other hand Einstein had come to 
the conclusion that Maxwell’s equations, although true of macroscopic 
phenomena, did not provide the ultimate foundations for the whole of 
physics.1 He might therefore have wanted to make his space-time system 
independent of electrodynamics.2 

One question remains unanswered. Einstein faced a problem forced on 
him by the incompatibility, if taken together, of the following three 
hypotheses: 

(P1) The Relativity Postulate 

(N) Newton’s second law of motion 

(P3) The Maxwell-Lorentz equations3 

His solution consisted in modifying N, or rather in replacing N by a new 
theory N’ such that P1 and N’ and P3 are consistent. Considering that the 
(Galilean) Relativity Principle was first shown by Newton to hold for 
mechanics, it is puzzling that Einstein seems never to have envisaged 
keeping N and substituting for P3 a new set of equations P3’ covariant 
under the Galilean transformation. P3’ would of course have had to yield 
P3 as a limiting case.4 


1 Einstein did not accept Lorentz’s (tentative) assumption that all physical phenomena 
could be explained in terms of charges and fields governed by Maxwell’s equations. 

2 He wrote in his [1955]: ‘I knew only of Lorentz’s works in 1895, “La Théorie Electro- 
magnétique de Maxwell” [this is in fact Lorentz’s [1892a]] and “Versuch einer Theorie der 
electrischen und optischen Erscheinungen in bewegten Koerpern”, but not Lorentz’s later 
works, nor the consecutive investigations by Poincaré. In this sense my work of 1905 was 
independent. The new feature of it was the realisation of the fact that the bearing of the 
Lorentz transformation transcended its connection with Maxwell’s equations and was 
concerned with the nature of space and time in general. A further new result was that 
Lorentz-invariance is a general condition for any physical theory. This was for me of 
particular importance because I had already previously found that Maxwell’s theory did 
not account for the micro-structure of radiation and could therefore have no general 
validity.’ Einstein indicates that a connection between the Lorentz transformation and 
Maxwell’s equations clearly existed but was then transcended. The logical picture seems 
to be as follows: 

P1 and P3 => P1 and P2 

P1 and P2 => new kinematics and Lorentz-transformation equations. 

P1 and Lorentz-transformation equations => requirement of Lorentz-invariance for 
all physics 

The connection between Maxwell’s theory and the Lorentz-transformation is given by: 

P1 and P3 => Lorentz equations 

This connection is transcended by the result that, from the Relativity Principle P1 and 
the Lorentz transformation equations, there follows a new structure of space-time and a 
condition of Lorentz-invariance which applies not only to Maxwell’s equations but to 
the whole of physics. 

3 Of course there is the underlying and common assumption that the law of inertia should 
hold in all allowable frames. This is precisely why these frames are called ‘inertial’. 

4 W. Ritz adopted this approach. Rather than adjusting the whole of physics to electro- 
dynamics, he tried to alter electrodynamics so as to make it Galileo-covariant. He looked 
upon the field quantities E and H as intermediate quantities which enable one to 
compute the Lorentz force $F = e(E + v \times H/c)$. In the last analysis only the particles, 

This is all the more intriguing because he realised that Maxwell’s 
equations were not as fundamental or as ultimate as Lorentz had taken 
them to be.1 In other words: why did Einstein throw in his lot with Max- 
well rather than with Newton? In the end Einstein guessed; it was not 
however as uninformed a guess as it might at first appear. 

Einstein was dissatisfied with the further and fundamental dualism 
between fields and particles which beset Lorentz’s theory. In virtue of his 
Prescription I, it seemed obvious to Einstein that one component of the 
dual structure ought to be reduced to the other. But which one? He found, 
as he explained in his Autobiography, that all attempts at a mechanical 
explanation of the behaviour of the field had failed.2 


their masses, charges and relative velocities are real. Everything else is scaffolding. Ritz 
then considers the equation found by Schwarzschild for the Lorentz force which 
one accelerating charged particle exerts on another. (Cf. Ritz [1911], p. 378.) This 
equation is not Galileo-covariant because, among other things, it involves the ‘absolute’ 
velocities of the particles, i.e. their velocities relatively to the underlying frame of refer- 
ence. Since such absolute velocities are altered by a Galilean transformation, Ritz sets out 
the following programme: alter Schwarzschild’s equation in such a way that the new 
equation involves only the accelerations and the relative velocities of the particles. In his 
general equations Ritz left three functions, $\phi$, $\psi$ and $\chi$, totally undetermined; he then 
proposed to adjust these functions so as to account for known experimental results. 
Ritz’s programme did not attract many disciples (cf. O’Rahilly [1965], Chapter XI); a 
Kuhnian might be tempted to attribute this to his early death, but then there are in fact 
good objective reasons for the lack of interest among other scientists. First, in view of the 
proposed adjustment of parameters (which in this case happen to be functions), his 
programme held out no promise for empirical progress. Secondly, even when he did 
adjust the parameters, he did so only in order to match results which Lorentz had already 
obtained. (Cf. Ritz [1911], p. 416.) 

1 It is also intriguing because, according to McCormmach, Einstein was 
initially inclined to regard mechanics as the most fundamental branch of physics. (Cf. 
McCormmach [1970].) 

2 Einstein adduces from Planck’s quantum hypothesis a second reason for abandoning 
mechanics in favour of electrodynamics. Although this reason seems to be a post hoc 
rationalisation, I shall quote Einstein in full: ‘[Planck’s] form of reasoning does not make 
obvious the fact that it contradicts the mechanical and electrodynamic basis, upon which 
the derivation otherwise depends. Actually, however, it presupposes implicitly that 
energy can be absorbed and emitted by the individual resonator only in quanta of 
magnitude $h\nu$, i.e. that the energy of a mechanical structure capable of oscillations as well 
as the energy of radiation can be transferred only in such quanta—in contradiction of the 
laws of mechanics and electrodynamics. The contradiction with dynamics was here funda- 
mental; whereas the contradiction with electrodynamics could be less fundamental. For the 
expression for the density of radiation energy, although it is compatible with Maxwell's 
equations, is not a necessary consequence of these equations.’ (Einstein [1949], p. 45; my 
italics.) 

This passage indicates that Einstein gave precedence to Maxwell over Newton and took 
his starting point with electrodynamics rather than with mechanics. Nevertheless, 
having accepted Planck’s quantum hypothesis, Einstein could not regard Maxwell’s 
equations as fundamental. Thus Einstein had enough reservations about electrodynamics 
to avoid making it into a cornerstone of his kinematics. Although P3 implies and lends 
plausibility to the Light Principle, the latter is still more fundamental in the sense of 
applying both to micro- and macroscopic phenomena; Einstein’s lucky and unexplained 
guess was that the invariance of c was a universal principle which transcends its obvious 
dependence on Maxwell’s equations. 

If mechanics was to be maintained as the foundation of physics, Maxwell’s 
equations had to be interpreted mechanically. This was zealously but fruitlessly 
attempted, while the equations were proving themselves increasingly fruitful. 


We have seen how Einstein arrived at his programme and hence at 
S.R.T. We shall see that this programme finally superseded Lorentz’s in 
the strictly empirical sense in 1915.2 But was Einstein’s programme object- 
ively superior to Lorentz’s in 1905? Did Lorentz’s programme, as is gen- 
erally claimed, really collapse in the face of S.R.T.?3 


2.3 The Heuristic Superiority in 1905 of the Relativity Programme: 
Einstein’s Covariance versus Lorentz’s Ether. The Power of Einstein’s 
Heuristics: Derivation of a New Relativistic Law of Motion and of $E = mc^2$. 

As I have already shown,4 Lorentz’s theory T3 is observationally equiva- 
lent to the S.R.T.; Einstein’s transformed coordinates can be interpreted 
as the measured coordinates in Lorentz’s moving frame. In the latter the 
‘real’ coordinates are still the Galilean ones: $x_r = x - vt$, $y_r = y$, $z_r = z$, 
$t_r = t$; but, due to the contraction of measuring rods, to time-dilation and 
to the synchronisation of clocks through light signals, the measured co- 
ordinates are: 

$x' = \beta(x - vt),\quad y' = y,\quad z' = z,\quad t' = \beta(t - vx/c^2),$ 

where $\beta = 1/\sqrt{1 - v^2/c^2}$. 

Thus, as he indicated at the end of his Theory of Electrons, Lorentz was 
in a position so to reformulate his theory that no ‘crucial’ experiment be- 
tween his system and Einstein’s could have been devised in 1905.5 
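
The difference between the ‘real’ Galilean coordinates and the measured coordinates can be checked numerically. The Python sketch below is only an illustration under stated assumptions: the function names (`beta`, `galilean`, `measured`, `signal_speed`) are hypothetical, the chosen velocity of 0.6 c is arbitrary, and the light signal is assumed to leave the origin along the x-axis at t = 0 in the ether frame.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def beta(v, c=C):
    """The factor the text calls beta: 1 / sqrt(1 - v^2/c^2)."""
    return 1.0 / math.sqrt(1.0 - (v / c) ** 2)


def galilean(x, t, v):
    """'Real' coordinates in the moving frame: x_r = x - v t, t_r = t."""
    return x - v * t, t


def measured(x, t, v, c=C):
    """Measured coordinates: x' = beta (x - v t), t' = beta (t - v x / c^2)."""
    b = beta(v, c)
    return b * (x - v * t), b * (t - v * x / c ** 2)


def signal_speed(x, t):
    """Speed of a signal that left the origin at time zero and is now at (x, t)."""
    return x / t


# Illustrative values (assumptions, not from the text): frame velocity 0.6 c,
# light signal observed one second after emission in the ether frame.
v = 0.6 * C
t = 1.0
x = C * t

xr, tr = galilean(x, t, v)
xp, tp = measured(x, t, v)

# In Galilean coordinates the signal speed comes out close to c - v = 0.4 c;
# in the measured coordinates it comes out close to c again.
print(signal_speed(xr, tr) / C)
print(signal_speed(xp, tp) / C)
```

On these assumptions the Galilean coordinates give the light signal the speed c − v, whereas the measured coordinates restore the value c, which is why no measurement carried out in the moving frame could separate the two descriptions.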

In view of this situation, why did brilliant mathematicians and physicists 
like Minkowski and Planck abandon the classical programme in order to 
work on Special Relativity? Given the lack of any crucial experiment, a 
Kuhnian account of the ‘conversion’ of Planck and others may seem 
plausible. But the idea of a new bandwagon is highly implausible. First, 
Einstein was a relatively unknown figure while Lorentz was a recognized 
authority. Secondly, Lorentz’s theory was eminently intelligible whereas 
1 Einstein [1949], p. 25. 2 Cf. below, section 3. 

3 Cf. Part I, p. 116. 4 Cf. Part I, pp. 120-1. 

5 For the argument that follows, I do not even need the assumption that T3 and S.R.T. are 
observationally equivalent. It is enough that: (1) between 1905 and 1908 no ‘crucial’ 
experiment between the two rival theories was carried out; and (2) neither hypothesis 
logically implies the other. (1) is a historical fact; as for (2), Lorentz proposed a ‘model’ 
of the electron as a spherical distribution of charge in the ether, while Einstein remained 
agnostic as to the shape, charge density and mass of the electron; on the other hand 
Einstein asserted that all physical laws are Lorentz-covariant whereas Lorentz restricted 
his attention largely to electrodynamics (and did not fully establish the covariance of 
Maxwell’s equations). 

Einstein’s involved a major revision of our most basic notions of space and 
time. Thirdly, there was no build-up of unsolved anomalies which Einstein’s 
theory dissolved better than Lorentz’s.1 Moreover, at the time when 
Planck was converted, that is in 1906, no bandwagon had started.2 Nor did 
the leading protagonists of the old paradigm die out unconverted as Kuhn 
claims is generally the case. Lorentz himself was eventually converted to 
the new outlook. In the Theory of Electrons, first published in 1909, he gives 
essentially the same account of the theory of corresponding states as in his 
Electromagnetic Phenomena of 1904; however, the footnotes indicate that by 
1915 he had already accepted the Relativity Principle. 

Kuhnian explanations of the victory of S.R.T. do not work. Another 
explanation is Whittaker’s.3 He tackles the difficulty by considering 
Lorentz and Poincaré as the real authors of Special Relativity, leaving to 
Einstein the merit of proposing a new theory of gravitation (i.e. General 
Relativity). Thus Lorentz’s ether programme was not defeated by, but 
developed into, the Relativity programme. However interesting and 
plausible this explanation may seem in the light of the foregoing discussion, 
it is unacceptable. As will be shown, the two programmes possess very 
different heuristics. 

The most commonly held explanation is the third one. According to this, 
Einstein’s theory represented the success of positivism in ridding classical 
physics of redundant metaphysics.4 

I shall both develop this positivist claim and present my answer to it by 


1 Did Lorentz face insuperable difficulties which were known to his contemporaries? We 
have seen that Lorentz used the ‘Molecular Forces Hypothesis’ in order to obtain the 
appropriate laws about rod-contractions and clock-retardations. He had thereby assumed 
a transformational similarity between electromagnetic and molecular forces. In view of 
his programme, his next most natural step would have been to give a precise classical law 
of force for molecular and atomic interactions. We have also seen that, in order to explain 
the variation of the inertia with the velocity, Lorentz accounted for the mass of an electron 
in electromagnetic terms (electromagnetic longitudinal and transversal masses). In other 
words, in producing the revolutionary results which either matched or even anticipated 
Einstein’s, Lorentz had to give a classical account of elementary particles. Lorentz was 
therefore unlucky in his choice of problems: he was straight away involved in difficulties 
which were to defeat Einstein himself, but at a much later stage. With hindsight we can 
see why Lorentz would probably have failed anyway; he was overtaken by the quantum 
theorists who realised that classical laws and in particular Maxwell’s equations, did 
not explain atomic stability. However, in 1905 there was hardly any indication that 
Lorentz could go no further in developing his programme and that no satisfactory classical 
account could be given of the behaviour of elementary particles. Yet, already in 1906, a 
physicist of Planck’s stature and conservatism abandoned the classical approach, knowing 
very well that the S.R.T. might well have been refuted by Kaufmann’s experiment. 
Planck’s choice, if rational, must have been guided by considerations different from the 
ones just given. 

2 A Kuhnian might fall back on individual Gestaltswitches: but if so, the Gestaltswitches 
would be different for Planck, for Minkowski, for Sommerfeld, for Lorentz! 

3 Cf. Whittaker [1953], Chapters II and V. 

4 Cf. Bridgman [1936], pp. 7-9, von Laue [1952], p. 6, Eddington [1939], pp. 70-5. Also 
Eddington [1920], pp. 1-16. 

comparing the Einsteinian revolution with the Copernican (or rather the 
Keplerian) one. This comparison will point to a feature which has often 
been a symptom if not a cause of decline in the heuristic of research pro- 
grammes. This feature is a divorce between the empirical content and the 
mathematical formulation of certain scientific hypotheses: these hypotheses 
contain a large number of (physically) uninterpreted mathematical entities. 
According to positivists like Mach, the mere elimination of such entities 
increases simplicity and thereby constitutes progress. My claim is that such 
eliminations are by-products of new research programmes whose heuristic 
eliminates certain entities. This may be accompanied by a—contingent— 
increase in simplicity. Let us take the example of the Copernican Revolu- 
tion. 

The Platonic programme of saving the phenomena by the use of circular 
and spherical motions was initially successful: to each mathematical entity 
corresponded a physical one. Each planet was fixed on a physically real 
crystalline sphere which performed a number of axial rotations. It was 
however discovered that the distance between the earth and a given planet 
varied, so the astronomers resorted to eccentrics, epicycles and equants in 
order to account for the new phenomena. The physical problem was to 
determine the motion of the heavenly bodies relative to the earth. Since the 
paths of the planets are non-circular and since their motion is non-uniform, 
a widening gap appeared between the physical problem and the mathe- 
matical methods, which allowed only for circular motions. Although the 
earth allegedly occupied the centre of the universe, the paths of the planets 
about the earth were not dealt with directly; epicycles, deferents and 
equants, all of which had no ‘physical reality’, were introduced in order to 
predict astronomical data; both the centre of an epicycle and the punctum 
equans are empty points in space. 

Copernicus did not heal this rift between the physical picture and the 
mathematical description. True, he got rid of the equant; but, although his 
problem was to determine the motion of the planets with respect to a fixed 
sun, he interposed between the sun and the planets roughly as many 
epicycles with as many empty centres as were involved in the Ptolemaic 
system. It was left to Kepler to investigate the direct relation between the 
sun and the planets, to abolish epicycles and to find that the planets 
describe ellipses with one focus at the centre of the sun. 

Let us now return to Lorentz. We have seen that the Lorentz-trans- 
formation is always carried out in two steps. 


1 This is so to speak the obverse of the point made earlier about the second heuristic role
of mathematics in physics. There we saw how new physical theories can be constructed
by interpreting hitherto uninterpreted mathematical entities. (Cf. Part I, pp. 109-11.)

3 Cf. Part I, p. 117.
The first step yields the Galilean coordinates:

$x_r = x - vt,\quad y_r = y,\quad z_r = z,\quad t_r = t$

The second one gives us the effective coordinates:

$x' = \beta(x - vt),\quad y' = y,\quad z' = z,\quad t' = \beta(t - vx/c^2)$

The Galilean coordinates are interposed between the absolute coordinates 
and the effective ones in the same way as various epicycles were placed 
between the earth or the sun on the one hand and the planets on the other. 
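
The two-step structure can be checked numerically. The following Python sketch is an illustration only: the function names are hypothetical, motion is restricted to the x-axis, and the second step written in terms of the Galilean coordinates (x' = βx_r, t' = t_r/β − βvx_r/c²) is not quoted from the text but obtained by substituting x = x_r + vt into the effective coordinates, with β = 1/√(1 − v²/c²) assumed.

```python
import math


def galilean_step(x, t, v):
    """First step: absolute coordinates (x, t) -> Galilean coordinates (x_r, t_r)."""
    return x - v * t, t


def effective_step(x_r, t_r, v, c):
    """Second step: Galilean coordinates -> effective coordinates (x', t').

    Assumed form, derived by substituting x = x_r + v*t into the
    effective coordinates given in the text.
    """
    beta = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    return beta * x_r, t_r / beta - beta * v * x_r / c ** 2


def lorentz_two_step(x, t, v, c):
    """Lorentz's route: pass through the Galilean coordinates."""
    x_r, t_r = galilean_step(x, t, v)
    return effective_step(x_r, t_r, v, c)


def lorentz_direct(x, t, v, c):
    """Direct route: (x, t) -> (x', t') without the intermediate step."""
    beta = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    return beta * (x - v * t), beta * (t - v * x / c ** 2)
```

Both routes return the same effective coordinates, which is the sense in which the intermediate Galilean coordinates do no observable work.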

In a moving frame only the Galilean coordinates are taken by Lorentz to 
be ontologically ‘real’ in the same way that before Kepler only circular 
motions were considered permissible from a metaphysical point of view. 
These metaphysical assumptions were naturally reflected in the mathe- 
matics: in the Galilean transformation used by Lorentz and in the epicycles 
used by Ptolemy and Copernicus. The Galilean transformation is a vestige 
of the original aim of the Classical Programme, namely the aim of giving to 
the ether frame a privileged status. (Because of the Galilean transformation, 
Maxwell’s equations hold good only in the ether frame.) The assumption 
of an ether frame no longer has any observational cash-value. Similarly the 
Ptolemaic epicycles were reminders of a hope which had long vanished, the
hope of finding that the motions of the planets are both uniform and 
circular. 

Copernicus was aware that the motions of the planets are neither circular 
nor uniform and Lorentz later realised that the effective coordinates, and 
not the Galilean ones, are the measured quantities in the moving frame. 

I have drawn a parallel between Copernicus and Lorentz. Kepler and 
Einstein can be similarly compared. Kepler’s greatest contribution to
astronomy allegedly consisted in eliminating epicycles and in showing that 
the ‘real’ paths of the planets are ellipses with one focus at the sun. 
Similarly, according for instance to Bridgman and to von Laue, Einstein’s 
chief merit lay in abolishing the Galilean transformation and in identifying 
the effective or measured coordinates as the only real ones. In equating 
‘to be’ with ‘to be perceived or measured’ Einstein is supposed to have 
carried out a positivistic revolution in physics. However, if the merit both 
of Kepler and of Einstein only consisted in ridding physics of unnecessary 
1 By drawing a parallel between the Galilean transformation on the one hand and a system 

of epicycles on the other, I do not want to suggest that, in Lorentz’s theory $T_3$, the
Galilean transformation is physically uninterpreted. In fact, even the epicycles can be 
interpreted in the following trivial way: God, in contemplating his creation, sees it as a 
huge system of interlocking circles. Similarly, in Lorentz’s case, God would perceive an 
infinite extended substance, the ether, in which any two events are separated by an
absolute time interval. Such interpretations, which do not increase the empirical content 
of existing theories, could conceivably be made useful by indicating how they are to be 


heuristically exploited in order to construct new physical theories. Lorentz did not give 
such an indication in connection with the Galilean transformation. (Cf. below, p. 243.)
‘epicycles’, then the importance of these two physicists in the history of
science is very much overrated: Copernicus and Lorentz did all the creative 
work, and Kepler and Einstein only applied Occam’s razor in order to 
demolish the expendable metaphysical scaffolding used by their pre- 
decessors. Moreover, Copernicus knew that the paths of the planets were 
not circular, hence that his epicycles were part of the scaffolding; Lorentz 
realised that he did not need the Galilean coordinates in order to deduce 
the null results which he set out to explain. If so, Kepler and Einstein 
contributed to the economy of thought and not to the growth of knowledge. 

This is an unacceptable conclusion. Let us start with Kepler. 
Copernicus’s account of the motion of heavenly bodies had been largely 
Aristotelian in character: because the planets are perfect spheres, their 
natural motion is both uniform and circular. Through trying to give a 
dynamical explanation of the motion of heavenly bodies, Kepler provided 
classical astronomy with its heuristic. He proposed to determine the forces 
which emanate from the sun and directly act on the planets. He abolished 
epicycles because he wanted nothing but forces to mediate between the sun 
and the planets. Circles centred on empty points did nothing but conceal 
the ‘true’ relation which linked one heavenly body to the other. Kepler 
proposed a dynamical theory which is now largely forgotten because it was 
contradicted and supplanted by Newtonian astronomy. But in forgetting 
Kepler’s dynamical theory we should not forget that Kepler created the 
programme which culminated in the Newtonian system; Kepler’s method 
consisted in trying to discover the law of force responsible for the periodic 
motion of the planets round the sun. Getting rid of Copernican epicycles was 
not an end in itself: it was subordinate to the needs of the new heuristic. 

Einstein, like Kepler, created a programme, not only an isolated theory. 
We shall see that Einstein’s heuristic is based on a general requirement of 
Lorentz-covariance for all physical laws; we recall that the Lorentz-transformation
sends $(x, y, z, t)$ directly into $(x', y', z', t')$ without passing by the
Galilean coordinates $x_r, y_r, z_r, t_r$. The new heuristic therefore requires the
abolition of the Galilean transformation which plays the role of a cumbersome 
epicycle. The parallel with Kepler is complete. 

After these criticisms of the Kuhnian, Whittakerian and positivist 
‘explanations’ of the Einsteinian revolution, let me venture my own. In my 
view the main difference between Lorentz and Einstein lies in the difference 
between the heuristics of their respective programmes. The ether programme 
did not collapse but was superseded by a programme of greater heuristic power. 
This greater heuristic power explains why Planck and others joined 
Einstein’s programme before it became empirically progressive. The 
difference between the two theories cannot be appreciated by taking an
instantaneous look at Lorentz’s and Einstein’s systems. One has first to
imbed them in their respective programmes. In this way one realises that 
the two theories are similar because they stand at the intersection of two 
research programmes which later diverged. It will further be shown that 
the difference between the two approaches did not emerge with hindsight 
but guided the deliberate choice of scientists at the beginning of this 
century.1

Lorentz, unlike Einstein, did not create the heuristic of his own pro- 
gramme. The heuristic of Lorentz’s programme consisted in endowing the 
ether with such properties as would explain the behaviour both of the 
electromagnetic field and of as many other physical phenomena as possible. 
In view of the overwhelming success of Newtonian dynamics it is hardly 
surprising that the ether was supposed to possess primarily mechanical 
properties. The ether programme developed rapidly in certain respects, yet 
towards the end of the nineteenth century its positive heuristic was running out
of steam. A succession of mechanical models for the ether were proposed 
and abandoned. One serious difficulty was the presence in these models of 
longitudinal as well as transversal waves.2 Lorentz faced a daunting
problem of a different sort: in order to explain certain electromagnetic 
phenomena he postulated an ether at rest. He considered a portion of the 
ether, calculated the resultant R of the Maxwellian stresses acting on its 
surface and found that R is generally non-zero. Hence, if he was to assume 
that the ether was anything like an ordinary substance, he would have also 
to suppose that it was in constant motion. But this contradicted his original 
assumption of an ether at rest. He concluded ‘that the ether is undoubtedly 
widely different from all ordinary matter’ and that ‘we may make the 
assumption that this medium, which is the receptacle of electromagnetic 
energy and the vehicle for many and perhaps for all the forces acting on 
ponderable matter, is, by its very nature, never put in motion, that it has 


1 I have reached the seemingly paradoxical conclusion that both Einstein (and Planck) on 
the one hand and Lorentz on the other were perfectly rational in doing what they did, i.e.
in doing opposite things. Let me immediately add that they were rational, given their
metaphysical positions. The conflict between Lorentz and Einstein is, among other things,
the age old conflict between two metaphysical doctrines which, Polanyi notwithstanding, 
do not belong to the tacit component but can be articulated. Lorentz held that the universe 
obeys intelligible laws (e.g. wave processes presuppose a medium, there exists an absolute 
‘now’ etc.) and Einstein held that the universe is governed by principles which can be 
given a mathematically coherent form (e.g. all laws are covariant). All major scientific
revolutions were accompanied by an increase of mathematical coherence together with a 
(temporary) loss of intelligibility. (This applies to the Copernican, to the Newtonian, to 
the Einsteinian and to the quantum-mechanical revolutions.) It can moreover be argued 
that intelligibility is a time-dependent property, while mathematical coherence is not. 
We still consider Newtonian astronomy more coherent than Ptolemaic astronomy; but 
action-at-a-distance was unintelligible before Newton, became perfectly intelligible at 
the end of the eighteenth century, and again unacceptable after Maxwell. 

2 Cf. Whittaker [1951], Chapter V.
neither velocity nor acceleration, so that we have no reason to speak of its
mass or of forces that are applied to it’.1 In other words Lorentz had
reached a point where the behaviour of the electromagnetic field dictated 
what properties the ether ought to have, no matter how implausible these 
properties might be: for example the ether was to be both motionless and 
acted upon by non-zero net forces. The ether was nothing but the carrier 
of the field. This involved a reversal of the heuristic of Lorentz’s programme: 
instead of learning something about the field from a general theory of the ether, 
he could only get at the ether post hoc by way of the field. In the case of the 
M.F.H., for example, Lorentz first studied the transformational properties
of the electromagnetic field; only then did he extend these properties to 
other molecular forces. Instead of positing one medium endowed with 
certain properties from which all forces inherit some common characteristic, 
we have an electromagnetic field acting as the archetype which determines 
the respects in which all forces are similar. 

I do not claim that the ether programme was beyond redemption. Of 
course there was no obvious reason why the postulation of some non- 
mechanical properties of the ether should not account both for electro- 
magnetic phenomena and for molecular interactions. All I claim is that the 
heuristic, as it stood, had petered out and that the ether programme was in 
need of a ‘creative shift’,2 a shift which, as a matter of fact, Lorentz did
not provide. 

Einstein based his heuristic on the requirement that all physical laws 
should be Lorentz-covariant; i.e. all theories should assume the same form, 
whether they are expressed in terms of $x, y, z, t$ or in terms of $x', y', z', t'$.
But it would be practically impossible to discover new laws simply by 
looking out for all the equations which are covariant under the Lorentz 
transformation. A good method is to start from well-tested laws whose past 
success would anyway have to be explained by any new theory. Thus the 
heuristic of Einstein’s programme is based on two distinct requirements: (1) a
new law should be Lorentz-covariant and (2) it should yield some classical law
as a limiting case. 

We have just seen that Lorentz used the ether in order to extend certain 
properties of the electromagnetic field to molecular forces. His methods 
were effective in explaining Michelson’s and other null results. By requiring 
that all forces and not only the electromagnetic and the molecular forces 
obey the same transformation laws; by taking Maxwell’s equations and 
imposing their transformation properties on the whole of physics, Einstein 


1 Lorentz [1909], p. 30. 
2 This is a technical term in the methodology of scientific research programmes: cf.
Lakatos [1970], p. 137.
both strengthened those Lorentzian methods which had proved effective in
particular cases and turned them into a heuristic of general applicability. In 
this sense Einstein’s programme displayed greater heuristic power than 
Lorentz’s. 

Let us give a more formal rendering of these two requirements. Let
$R(a_1, a_2, \ldots, a_n) = 0$ be an equation which constitutes a physical law in
some inertial frame J. If J’ is any other inertial frame in which the quantities
$a_1, a_2, \ldots, a_n$ assume the values $a'_1, a'_2, \ldots, a'_n$ respectively, then
by the Relativity Principle:

(1) $[R(a'_1, a'_2, \ldots, a'_n) = 0] \Leftrightarrow [R(a_1, a_2, \ldots, a_n) = 0]$

But as Kretschmann pointed out to Einstein, every empirical law can be
given not only a Lorentz-covariant but also a generally covariant expression 
(of course, general covariance implies Lorentz-covariance).1 Thus, on the 
face of it, the most distinctive requirement of Einstein’s heuristic is empty. 
However the requirement is only trivialised if one is allowed complete 
freedom in reformulating the law. If one is restricted to a given number of 
entities $a_1, a_2, \ldots, a_n$, then the covariance requirement, far from being
empty, becomes a stringent condition. As we shall see, in each particular
case in which the heuristic is applied, the entities involved in the covariant
law are precisely those involved in the corresponding classical law.2

Now we consider the requirement that a new relativistic law should
yield the corresponding classical theory as a limiting case. In the most
general case laws will involve the speed of light $c$, the velocities $v_1, \ldots, v_n$ of
a finite number of particles or processes and some other quantities $a, b, \ldots$
If $R = 0$ and $K = 0$ are the relativistic and classical laws respectively, we
require that:

$R \to K$ as $(v_1/c, v_2/c, \ldots, v_n/c) \to (0, 0, \ldots, 0)$.

There are at least two ways of letting $v_m/c$ tend to zero for $m = 1, 2, \ldots, n$.
First we take $c$ to be a constant and let $(v_1, \ldots, v_n)$ approach zero. In this
case we put $w_m = v_m/c$, for all $m = 1, 2, \ldots, n$, and consider both R and K
as functions of $c, w_1, \ldots, w_n, a, b, \ldots$

In other words, we write:

$R = R(c, w_1, \ldots, w_n, a, b, \ldots)$ and $K = K(c, w_1, \ldots, w_n, a, b, \ldots)$

We then make
$R(c, w_1, \ldots, w_n, a, b, \ldots) - K(c, w_1, \ldots, w_n, a, b, \ldots)$


1 Cf. Kretschmann [1917] and Einstein [1918].

2 This problem arises also in the case of General Relativity where a different set of
restrictions again renders the covariance principle non-empty. (Apart from the energy tensor
$T_{\mu\nu}$ only the $g_{\mu\nu}$’s and their first and second order derivatives can occur in the field
equations.)
approach zero as $w_1, \ldots, w_n$ simultaneously tend to zero. It is of course
tacitly assumed that R and K are continuous functions. Hence

$R(c, w_1, \ldots, w_n, a, b, \ldots) - K(c, w_1, \ldots, w_n, a, b, \ldots)$
approaches
$R(c, 0, \ldots, 0, a, b, \ldots) - K(c, 0, \ldots, 0, a, b, \ldots)$
as $(w_1, \ldots, w_n)$ approaches $(0, \ldots, 0)$. Thus the second requirement
reduces to the equation:
$R(c, 0, \ldots, 0, a, b, \ldots) = K(c, 0, \ldots, 0, a, b, \ldots)$.

In this first case the function R, which is to be determined, will therefore 
be subjected to the following two conditions: 

(1′) $[R(c, w_1, \ldots, w_n, a, b, \ldots) = 0] \Leftrightarrow [R(c, w'_1, \ldots, w'_n, a', b', \ldots) = 0]$
(Relativity Principle)

(2) $R(c, 0, \ldots, 0, a, b, \ldots) = K(c, 0, \ldots, 0, a, b, \ldots)$
(Requirement that the classical law be a limiting case of the new law)

We recall that

$R(c, w_1, \ldots, w_n, a, b, \ldots) = R(c, v_1/c, \ldots, v_n/c, a, b, \ldots) = 0$
is the relativistic law which is to replace the classical equation

$K(c, w_1, \ldots, w_n, a, b, \ldots) = K(c, v_1/c, \ldots, v_n/c, a, b, \ldots) = 0$.
If the relativistic law holds good in general, it will in particular be true for
vanishing velocities: i.e. for $v_1 = \ldots = v_n = 0$ or equivalently for
$w_1 = \ldots = w_n = 0$. By (2) it follows that:
(3) $K(c, 0, \ldots, 0, a, b, \ldots) = 0$.

This last equation means that, when $v_1, \ldots, v_n$ all vanish, the relativistic
law collapses into the classical one, which must therefore hold good in this
particular case.1

There is a second way of letting $v_1/c, \ldots, v_n/c$ tend to zero, namely by
treating $c$ as a variable parameter, fixing the velocities $v_1, \ldots, v_n$ and then
letting $c$ tend to infinity.2 Putting $c = 1/\gamma$, we can write:

$R = R_0(\gamma, v_1, \ldots, v_n, a, b, \ldots)$ and $K = K_0(\gamma, v_1, \ldots, v_n, a, b, \ldots)$.
We now require that:

$[R_0(\gamma, v_1, \ldots, v_n, a, b, \ldots) - K_0(\gamma, v_1, \ldots, v_n, a, b, \ldots)] \to 0$,
as $c \to \infty$, i.e. as $\gamma = 1/c \to 0$.

1 Both Einstein and Planck assumed that Newton’s second law of motion holds good when
the velocity vanishes. (Cf. below, p. 247.)
2 Note that, as $c \to \infty$, the Lorentz transformation collapses into the Galilean one.
Assuming that $R_0$ and $K_0$ are continuous, we obtain:

(3′) $R_0(0, v_1, \ldots, v_n, a, b, \ldots) = K_0(0, v_1, \ldots, v_n, a, b, \ldots)$.1

One last way of meeting the requirement that R should tend to K as
$(v_1/c, \ldots, v_n/c)$ tends to zero is to assume that R is a function of certain
relativistic quantities and that K is the same function of the corresponding
classical quantities. Then, if R and K are continuous and if each relativistic
quantity tends to the corresponding classical one, it follows that:

$R \to K$ as $(v_1/c, \ldots, v_n/c) \to (0, \ldots, 0)$.2
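
The second route to the classical limit (velocities fixed, c tending to infinity) can be illustrated with the momentum example mentioned in the footnote to (3′). This Python sketch is only an illustration under stated assumptions: a single particle moving along one line, hypothetical function names, and arbitrary numerical values.

```python
import math


def relativistic_momentum(m, v, c):
    """Momentum m*v / sqrt(1 - v^2/c^2) for motion along one line."""
    return m * v / math.sqrt(1.0 - (v / c) ** 2)


def classical_momentum(m, v):
    """Classical momentum m*v."""
    return m * v


def momentum_gap(m, v, c):
    """Absolute difference between the relativistic and classical momenta.

    The velocity v is held fixed; only c is treated as a variable parameter.
    """
    return abs(relativistic_momentum(m, v, c) - classical_momentum(m, v))
```

For fixed m and v the gap shrinks as c grows, which is the numerical counterpart of requiring that the relativistic expression tend to the classical one as 1/c tends to zero.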


Having formulated the heuristic of the relativity programme in general 
terms, let me now give concrete examples illustrating the power of this 
heuristic. 

The first example is concerned with Planck’s modification of Newton’s 
second law of motion. Following Einstein, Planck considered a slowly 
accelerated electron in an inertial frame J. By substituting Lorentz’s 
expression for the ponderomotive force in Newton’s second law, it is found 
that the motion of the electron is governed by the equation 


$e\left(\vec{E} + \frac{\vec{v}}{c} \times \vec{H}\right) - m\vec{a} = 0$

where $e$ is the charge of the electron, $\vec{v}$ is its velocity, $\vec{a}$ its acceleration and
$m$ its mass; as usual $\vec{E}$ and $\vec{H}$ are the electric and magnetic fields respectively.
It is easily verified that this classical law is not Lorentz-covariant
and thus has to be modified. Let us denote $e\left(\vec{E} + \frac{\vec{v}}{c} \times \vec{H}\right) - m\vec{a}$ by
$K(\vec{v}/c, \vec{a}, \vec{E}, \vec{H})$. Planck implicitly assumed that the new relativistic law
would involve the same variables as the classical one. Thus let
$R(\vec{v}/c, \vec{a}, \vec{E}, \vec{H}) = 0$ be the relativistic equation which is to replace
$K(\vec{v}/c, \vec{a}, \vec{E}, \vec{H}) = 0$.

Consider the electron at the time $t$ when its velocity is $\vec{v}$, and choose an
inertial frame J’ which moves with the same velocity $\vec{v}$ with respect to J. In
J’ the electron is instantaneously at rest. If we denote by $\vec{v}'$, $\vec{a}'$, $\vec{E}'$ and $\vec{H}'$


1 The relativistic law of the conservation of momentum $\sum m_i \vec{v}_i/\sqrt{1 - v_i^2/c^2} = 0$ is a good
illustration of equation (3′): $\sum m_i \vec{v}_i/\sqrt{1 - v_i^2/c^2} \to \sum m_i \vec{v}_i$ as $c \to \infty$. Letting $(v_1, \ldots, v_n)$
tend to zero serves no purpose in this case; since, if we start from an arbitrary function
$f(v)$ and consider $\sum f(v_i)\vec{v}_i$, then: as $(v_1, \ldots, v_n) \to (0, \ldots, 0)$, $\sum f(v_i)\vec{v}_i \to 0$ = value of
$\sum m_i \vec{v}_i$ for $v_1 = v_2 = \ldots = v_n = 0$. This does not help us towards determining $f$.

2 Denote by $\mu_i$ the relativistic mass $m_i/\sqrt{1 - v_i^2/c^2}$. The relativistic momentum $\sum \mu_i \vec{v}_i$ and
the classical momentum $\sum m_i \vec{v}_i$ are the same functions of the masses and of the velocities.
Lewis and Tolman (tacitly) assumed that $\mu_i \to m_i$ as $v_i \to 0$. (Cf. Lewis [1908] and Lewis
and Tolman [1909].)
the quantities in J’ which correspond to $\vec{v}$, $\vec{a}$, $\vec{E}$ and $\vec{H}$ respectively, then
$\vec{v}' = 0$. By the Relativity Principle, i.e. by the equivalence (1′) above:
(4) $[R(\vec{v}/c, \vec{a}, \vec{E}, \vec{H}) = 0] \Leftrightarrow [R(\vec{v}'/c, \vec{a}', \vec{E}', \vec{H}') = 0]$

$\Leftrightarrow [R(0, \vec{a}', \vec{E}', \vec{H}') = 0]$

We have already explained that, for vanishing velocities, a relativistic 
equation must coincide with its classical counterpart. (This is a direct 
consequence of the requirement (2) above that the relativistic law should 
yield the classical one as a limiting case). 


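Therefore:
(5) $[R(0, \vec{a}', \vec{E}', \vec{H}') = 0] \Leftrightarrow [K(0, \vec{a}', \vec{E}', \vec{H}') = 0]$.
But:
(6) $K(0, \vec{a}', \vec{E}', \vec{H}') = e\left(\vec{E}' + \frac{\vec{v}'}{c} \times \vec{H}'\right) - m\vec{a}' = e\vec{E}' - m\vec{a}'$
By (4), (5) and (6) it follows that:
(7) $[R(\vec{v}/c, \vec{a}, \vec{E}, \vec{H}) = 0] \Leftrightarrow [e\vec{E}' - m\vec{a}' = 0]$

Using known transformation equations, we can express $\vec{E}'$ and $\vec{a}'$ in
terms of $\vec{E}$, $\vec{H}$, $\vec{v}$ and $\vec{a}$, and thus obtain:

(8) $[R(\vec{v}/c, \vec{a}, \vec{E}, \vec{H}) = 0] \Leftrightarrow \left[\frac{d}{dt}\left(\frac{m\vec{v}}{\sqrt{1 - v^2/c^2}}\right) - e\left(\vec{E} + \frac{\vec{v}}{c} \times \vec{H}\right) = 0\right]$

Thus $\frac{d}{dt}\left(\frac{m\vec{v}}{\sqrt{1 - v^2/c^2}}\right) = e\left(\vec{E} + \frac{\vec{v}}{c} \times \vec{H}\right)$ is the relativistic equation of motion
for an electron moving in an electromagnetic field. Planck took the Lorentz
force $e\left(\vec{E} + \frac{\vec{v}}{c} \times \vec{H}\right)$ to be the very paradigm of force and generalised the
last equation as follows:

(9) $\frac{d}{dt}\left(\frac{m\vec{v}}{\sqrt{1 - v^2/c^2}}\right) = \text{force} = \vec{f}$

Equation (9) is the relativistic law which replaced Newton’s second law
of motion. By using (9), the expression of the relativistic kinetic energy $k(v)$
can be determined. It is:

(10) $k(v) = \int \vec{f} \cdot \vec{v}\, dt = mc^2\left(\frac{1}{\sqrt{1 - v^2/c^2}} - 1\right)$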
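
The step from the law of motion to the kinetic energy can be checked numerically. The sketch below is an illustration under stated assumptions, not a reconstruction of Planck's calculation: motion is taken along one line, the particle starts from rest, the function names are hypothetical, and the integral of f·v dt is rewritten as the integral of u dp with p = mu/√(1 − u²/c²), so that dp/du = m/(1 − u²/c²)^(3/2).

```python
import math


def lorentz_factor(u, c):
    """1 / sqrt(1 - u^2/c^2)."""
    return 1.0 / math.sqrt(1.0 - (u / c) ** 2)


def kinetic_energy_closed_form(m, v, c):
    """Closed form m c^2 (1/sqrt(1 - v^2/c^2) - 1)."""
    return m * c ** 2 * (lorentz_factor(v, c) - 1.0)


def kinetic_energy_by_integration(m, v, c, steps=20000):
    """Midpoint-rule estimate of the work done in bringing the particle
    from rest to speed v along one line.

    Uses f dt = dp, so f * u dt = u * (dp/du) du with
    dp/du = m * lorentz_factor(u)**3.
    """
    du = v / steps
    total = 0.0
    for i in range(steps):
        u = (i + 0.5) * du
        total += u * m * lorentz_factor(u, c) ** 3 * du
    return total
```

With m = 1, c = 1 and v = 0.8 the two functions agree to well within the accuracy of the midpoint rule.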


Thus, by using Einstein’s heuristic together with the simple device of
choosing an inertial frame J’ in which the electron is instantaneously at
rest, Planck modified the law $\vec{f} = m\vec{a}$ which had been considered an unshakeable
convention of theoretical physics.

Let us now examine how Einstein, by using the same heuristic, arrived
at his famous equation relating mass and energy:
$E = mc^2/\sqrt{1 - v^2/c^2}$

Einstein assumed that there had to be a relativistic law corresponding to 
the classical law of conservation of energy. By the Relativity Principle this 
new conservation law must hold in all inertial frames. Einstein considered
an inertial frame J in which a stationary body B emits light and
thereby loses a certain amount of energy Q. Since energy is conserved:
(11) $E_1 = E_2 + Q$
where $E_1$ is the total energy of B before radiation and $E_2$ its total energy
after radiation. 

Einstein considers a second inertial frame J’ moving with velocity $-\vec{v}$
with respect to J. The body B, which is at rest in J, moves with velocity $\vec{v}$
relative to J’. By the Relativity Principle, energy must be conserved both
in J and in J’. Hence:

(12) $E'_1 = E'_2 + Q'$
where $E'_1$, $E'_2$ and $Q'$ are the quantities in J’ which correspond to $E_1$, $E_2$
and $Q$ respectively.

Subtracting (11) from (12):

(13) $E'_1 - E_1 = (E'_2 - E_2) + (Q' - Q)$.

Einstein interpreted $(E'_1 - E_1)$ as follows. $E_1$ is the energy of B in its
rest-frame J (before light is emitted). $E'_1$ is the energy of the same body B
as seen from the moving frame J’. In J’ the body B moves with velocity $\vec{v}$
but in J it is at rest. Hence $(E'_1 - E_1)$ is the energy which accrues to the body
B solely in virtue of its motion; i.e. $(E'_1 - E_1)$ is the kinetic energy of B to
within an additive constant. By (10)

(14) $E'_1 - E_1 = \frac{Mc^2}{\sqrt{1 - v^2/c^2}} - Mc^2 + h$,

where h is a constant and M is the rest mass of B before radiation. 
Similarly : 

(15) $E'_2 - E_2 = \frac{mc^2}{\sqrt{1 - v^2/c^2}} - mc^2 + h$,

where $m$ is the rest mass of B after radiation.

Note that Q is the energy lost through radiation in J. Einstein was in 
possession both of Maxwell’s equations and of the transformation laws for 
the field. From these he calculated that: 


(16) $Q' = Q/\sqrt{1 - v^2/c^2}$
Substituting from (14)–(16) into (13):
(17) $(M - m)c^2 = Q$; i.e. $c^2 \Delta M = Q$, where $\Delta M = M - m$.
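
The algebra leading to (17) can be verified mechanically. The following Python sketch is an illustration with hypothetical function names and arbitrary numerical values; it evaluates the two sides of (13) using (14), (15) and (16) and shows that they balance for every velocity once the rest mass after emission is M − Q/c².

```python
import math


def lorentz_factor(v, c):
    """1 / sqrt(1 - v^2/c^2)."""
    return 1.0 / math.sqrt(1.0 - (v / c) ** 2)


def rest_mass_after_emission(M, Q, c):
    """Equation (17), (M - m) c^2 = Q, solved for m."""
    return M - Q / c ** 2


def balance_residual(M, m, Q, v, c, h=0.0):
    """Left-hand side minus right-hand side of (13)."""
    g = lorentz_factor(v, c)
    lhs = M * c ** 2 * (g - 1.0) + h      # (14): E1' - E1
    rhs = m * c ** 2 * (g - 1.0) + h      # (15): E2' - E2
    rhs += Q * g - Q                      # (16): Q' - Q
    return lhs - rhs
```

The residual vanishes independently of v and of the additive constant h, whereas keeping the rest mass unchanged (m = M) leaves a non-zero residual for any non-zero v.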


Thus the rest mass of B has decreased by the amount $\Delta M = Q/c^2$. A
supposedly immutable substance, namely the rest mass $\Delta M$, can vanish
and thereby give rise to the equivalent amount of energy $c^2 \Delta M$. This
revolutionary result is a consequence of the Relativity Principle applied to 
the law of conservation of energy. True, Lorentz had shown that the electron 
possesses an electromagnetic inertia which varies with the speed. He had 
also found that the electromagnetic rest mass is a multiple of the electro- 
static energy. But neither in Lorentz’s [1904] nor in his [1909] is there any
indication that the rest mass is a variable quantity.1

The extraordinary power of the Relativity Principle is further 
displayed by the following fact. Given the Relativity Principle, the law 
of conservation of energy both implies and is implied by the law of 
conservation of momentum; where the momentum of a particle of rest 
mass $m$ and velocity $\vec{v}$ is the vector $m\vec{v}/\sqrt{1 - v^2/c^2}$, and the energy of the
particle is $mc^2/\sqrt{1 - v^2/c^2}$.

These examples show that the revolutionary relativistic laws were not 
arrived at in a sudden flash of intuition or through some kind of mystical 
insight. The new laws were mathematically derived from assumptions like the 
Relativity Principle which seem so ‘formal’ and innocuous as to be devoid of 
empirical content. 





3 Einstein's Programme Supersedes Lorentz’s. 


Einstein invented not a theory but a research programme with an 
immensely powerful heuristic. But research programmes are ultimately 
judged on their empirical rather than on their heuristic power. No matter 
how fruitful its heuristic guidelines for the construction of new theories are, 
the programme will not be successful if these theories are not empirically 
corroborated. In my view Einstein’s relativity programme superseded 
Lorentz’s in the empirical sense in 1915 with its explanation of the pre- 
cession of Mercury’s perihelion. This explanation requires the general 
theory. There were of course special relativistic results (e.g. $E = mc^2$)
which could in principle be tested, but even by 1915 such tests seemed to 
be only a remote possibility. 

My claim that Einstein’s programme superseded Lorentz’s with the 
explanation of the perihelion of Mercury raises two difficulties. First, since 
I wish to claim this as a success for the whole relativistic programme, I 
have to establish a continuity between the special and the general theories. 
Secondly, since the behaviour of Mercury was well-known, I shall have to 
show, in line with my definition of empirical support,2 that the Mercury
1 The rest mass of an electron is a function of the charge and of the radius. Lorentz took
both the charge and the radius to be constant.
2 Cf. Part I, pp. 101-4.
prediction was an unexpected consequence of the general theory. It may
seem that this preoccupation with Mercury’s perihelion is unnecessary in 
view of the fact that General Relativity made predictions which were novel 
in the temporal sense, e.g. the bending of light rays. However the Mercury 
prediction, in contradistinction to the bending of the light rays, was both 
in close agreement with observation and also depended on the full field
equations.1


3.1 The Continuity between the Special and General Theories of Relativity. 


The explanation of gravitation by General Relativity appears to involve a 
major shift in the methods used by Einstein. But I propose to show that 
after 1908 Einstein merely strengthened the methods already used in Special
Relativity2; the only new addition to the programme was in the heuristic
function of the Principle of Equivalence (i.e. of the equality of gravitational
and inertial masses).3

One might think that the General Theory constitutes simply a generalisa- 
tion to the case of accelerating frames. But this is only a small part of the 


1 Cf. Adler, Bazin and Schiffer [1965], p. 194. 
2 My views are in sharp contrast with those of Lanczos. In his [1972] Lanczos distinguishes
between a young Einstein who was supposedly a strict empiricist and an older Einstein
who indulged in speculation. Lanczos writes: ‘[Einstein] was at that time still a convinced
empiricist who would not have dared to argue that perhaps nature is based on rational 
and universal principles, which cannot be found by experimentation but only by inspired 
and imaginative speculation;...In fact Einstein in the beginning of his career dis- 
trusted mathematics and considered the mathematical formulation of a physical event as 
the mere form in which a phenomenon is described, which does not touch on its substance.
... To think in experimental terms was Einstein’s basic attitude in the prime of his
career in marked contrast to his ideas in the last phase of his life when in search for the
ultimate unification of nature he often fell victim to mere formalism.’ Against this view
I maintain that the roots both of General Relativity and of the various unified field 
theories go back to 1905. Had Einstein really been ‘a strict empiricist and a follower of 
Mach who saw the task of theoretical physics purely in the more or less accurate descrip- 
tion of experimental observations’, then why did he insist that the Relativity Principle 
should hold not only at the observational but also at the highest theoretical level? An 
empiricist would have been perfectly satisfied with a solution such as Lorentz’s which 
explains why no experiment could detect absolute motion. To insist that the Relativity 
Principle obtains at the level of laws presupposes a realistic interpretation of these laws 
beyond their functions as mere tools for the description of experimental observations. 
For Einstein observational symmetries are nothing but the mere reflection of a deeper 
symmetry at the ontological level. The empirical success of Einstein’s earlier work and 
the empirical failure of Einstein’s later work (unified field theory) cannot be explained by 
Lanczos’s claim that Einstein degenerated from empiricism to speculation. High level 
speculation paid off handsomely in the case of General Relativity Theory; the programme 
achieved stunning empirical success in 1915 and in 1919. It is hardly surprising that
Einstein ‘overestimated’ the speculative (or so-called aprioristic) method which enabled 
him to construct the General Theory and tried to apply the same methods to the con-
struction of unified field theories. He was unlucky, but the lack of empirical success
cannot be attributed, as Lanczos claims, to a change in methodological attitude. The same
attitude helped Einstein in the case of the General Theory, but it failed him later in life.
Einstein proposed and Nature disposed. 
3 For a more detailed account see Zahar [1973].
answer to the continuity problem. After 1905 Einstein’s main problem was
to devise a Relativistic Theory of Gravitation. He found it impossible to
reconcile his equation¹:

$$E = mc^2 = \frac{m_0 c^2}{\sqrt{1 - v^2/c^2}}$$

with the Principle of Equivalence. This Principle implies that, since the 
gravitational and the inertial masses of any material body are equal, all 
bodies fall with the same acceleration in the same gravitational field g. In 
classical physics this result follows from Newton’s second law 


$$f = \frac{d}{dt}(mv) = m\frac{dv}{dt} = ma$$


and from the equality of the inertial and gravitational masses. The masses
cancel out on both sides of $ma = mg$, leaving $a = g$. Einstein assumed
that in Special Relativity the corresponding equation would be some
relation of the form:

(inertial mass) g = rate of change of momentum;

i.e.

$$\frac{m_0\, g}{\sqrt{1 - v^2/c^2}} = \frac{d}{dt}\left(\frac{m_0 v}{\sqrt{1 - v^2/c^2}}\right).$$

In view of the equation:

$$E = \text{energy} = \frac{m_0 c^2}{\sqrt{1 - v^2/c^2}},$$

the rest mass $m_0$ may vary with the time; in other words we may have

$$\frac{dm_0}{dt} \neq 0.$$

Dividing through by $m_0$, we obtain on the right-hand side of:

$$\frac{m_0\, g}{\sqrt{1 - v^2/c^2}} = \frac{d}{dt}\left(\frac{m_0 v}{\sqrt{1 - v^2/c^2}}\right)$$

a term

$$\frac{1}{m_0}\frac{dm_0}{dt}\,\frac{v}{\sqrt{1 - v^2/c^2}}$$

which may be different from zero.

Thus the motion of a material body under the effect of the gravitational 
field will generally depend on its rest mass $m_0$. The Principle of Equivalence
is violated. 
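
The step from the assumed relativistic law to the mass-dependent term can be spelled out with the product rule. What follows is only a sketch under the assumptions already made in the text (motion along the field, a rest mass that may depend on the time); the abbreviation $\gamma$ is introduced here for brevity and does not occur in the original argument.

```latex
\begin{align*}
\gamma &\equiv \frac{1}{\sqrt{1 - v^2/c^2}}, \\
m_0 \gamma\, g &= \frac{d}{dt}\left(m_0 \gamma v\right)
               = m_0 \frac{d(\gamma v)}{dt} + \frac{dm_0}{dt}\,\gamma v, \\
\gamma\, g &= \frac{d(\gamma v)}{dt} + \frac{1}{m_0}\frac{dm_0}{dt}\,\gamma v .
\end{align*}
```

Only if $dm_0/dt = 0$ does the rest mass drop out of the last line; otherwise two bodies with different $m_0(t)$ would, on this law, fall differently in the same field.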

Rather than give up the Principle of Equivalence, Einstein gave up the 
hope of giving a special relativistic theory of gravitation. He changed his 
tactics and launched a two-pronged attack on the problem of gravitation: 

(1) Einstein only now remembered his Machian scruples concerning the


1 Warning to the reader: In this section $m_0$ denotes the rest mass and $m$ the inertial mass,
i.e. $m = m_0/\sqrt{1 - v^2/c^2}$.
so-called myth of the inertial frame. If one purges Newtonian Theory of
the Absolute Space Hypothesis, one finds that Special Relativity is as 
‘absolute’ as classical mechanics; both theories postulate a set of privileged 
inertial frames. From a Machian viewpoint this assumption is unacceptable.¹
Einstein decided to treat all coordinate systems on a par and to impose a 
condition of general covariance on all physical laws. This condition, which 
is a strengthening of the requirement of Lorentz covariance (General 
Covariance of course implies Lorentz covariance), is an important element 
of continuity between the special and the general theories of Relativity. 


(2) Einstein decided to go back to his original heuristic, in particular to 
the heuristic device which consists in scrutinizing known empirical results 
and in isolating certain features in them which are ‘unsatisfactorily’ 
explained by current theories.² As I have shown, Einstein analysed the well-
known ‘fact’ that all bodies fall with the same acceleration in the same way
that he had analysed the result of the induction experiment.³ Let me recall
that Einstein reached the conclusion that all gravitational fields can be 
regarded as caused by a local acceleration of the frame of reference. It is 
thus obvious why the introduction of accelerated frames holds out the 
promise of solving the problem of gravitation. 

The two prongs of the attack can now be seen to converge to the same 
result. We have here a second element of continuity between the Special 
and the General Theories: each involves an application of Prescription II. 
However, Einstein now faced a new difficulty. In Newtonian mechanics 
there exist so-called inertial fields which arise if one chooses an accelerat- 
ing frame of reference. These inertial fields are artificial in that they can be 
transformed away by one global change of coordinates, namely by a change 
which refers everything back to an inertial frame. This is not the case with 
‘real’ gravitational fields; for example the field at a point near the earth’s sur- 
face can be transformed away only at the cost of piling it up at the antipode. 
How could one ever hope to be able to deal with two such dissimilar fields 
in the same way? By a tremendous stroke of genius Einstein turned this 
seeming impasse into a powerful heuristic device. Consider an accelerating 
frame of reference S in which there exists both a ‘real’ gravitational field g
and an inertial field i. Every particle P is acted upon by a force
$F = m_i i + m_g g$. Since $m_i$ = inertial mass = gravitational mass = $m_g$ =
$m$ (say), it follows that $F = mi + mg = m(i + g) = mG$, where $G = i + g$.


1 This may have been a reason for Mach’s rejection of Special Relativity. (Cf. Mach
[1913], Preface.)
2 This is Prescription I; cf. above, p. 225. The heuristic of Special Relativity is only part of
Einstein’s general heuristic as expressed in Prescriptions I and II.
3 Cf. above, pp. 225-7.
Thus i and g always occur indissolubly fused into one global field G;
Einstein typically refuses to consider this ‘fact’ a mere accident and argues
that there exists only one total field G which a new theory of gravitation
would have to explain. In view of G = i + g, i is now a special case of an
Einsteinian gravitational field; G reduces to i if all matter in the universe is
either annihilated or removed to an infinitely distant point. The field i can
be globally transformed away through a single change of coordinates; in
other words i is a reducible gravitational field. Reducible fields offer the
advantage that they can be generated at will through an arbitrary accelera-
tion of the frame of reference. One can heuristically exploit the Principle of
Equivalence by generating a reducible field i through an acceleration of the
frame, by studying the properties of i and finally by extending these pro-
perties to non-reducible fields G.

But how is this generalisation to be carried out? This is where the 
absolute differential calculus proved extremely helpful. The method is as 
follows: start from an inertial frame $\bar S(\bar x^0, \bar x^1, \bar x^2, \bar x^3)$ in which Special
Relativity applies in its usual form; hence:

$$ds^2 = (d\bar x^0)^2 - (d\bar x^1)^2 - (d\bar x^2)^2 - (d\bar x^3)^2 = \bar g_{mn}\, d\bar x^m d\bar x^n$$

(we take the velocity of light to be unity) where:

$$(\bar g_{mn}) = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix}.$$

Accelerate the frame $\bar S$; in other words carry out a non-linear transforma-
tion of coordinates. In the accelerating frame $S(x^0, x^1, x^2, x^3)$: $ds^2 =
g_{mn}\, dx^m dx^n$, where $(g_{mn})$ varies from one point to the next. The matrix
$(g_{mn})$ is not arbitrary, since it satisfies the following condition which we
denote by K: through a global transformation of coordinates, namely
through the transformation $(x^m) \to (\bar x^m)$, $(g_{mn})$ is reducible to the constant
matrix $(\bar g_{mn})$. There exists in S a reducible gravitational field generated by
the acceleration of the frame S with respect to $\bar S$. We study the behaviour
of this field in S, then we generalise our results by abstracting from, i.e. by
lifting, the condition K. In doing this we have to study the same processes in
two different frames $\bar S$ and S; so we need a method for translating the
results obtained in $\bar S$ into results applying in S; the absolute differential
calculus provides such a method.
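
A concrete instance of a metric satisfying condition K may help. The uniformly rotating frame below is an illustration chosen here, not an example taken from the text; the symbols $\omega$ (angular velocity about the z-axis) and the unbarred coordinates $(t, x, y, z)$ of the rotating frame are assumptions of this sketch, with the velocity of light taken as unity as above.

```latex
\begin{align*}
\bar t &= t, \qquad
\bar x = x\cos\omega t - y\sin\omega t, \qquad
\bar y = x\sin\omega t + y\cos\omega t, \qquad
\bar z = z, \\[4pt]
ds^2 &= d\bar t^{\,2} - d\bar x^{2} - d\bar y^{2} - d\bar z^{2} \\
     &= \left[1 - \omega^2\,(x^2 + y^2)\right] dt^2
        + 2\omega y\, dx\, dt - 2\omega x\, dy\, dt
        - dx^2 - dy^2 - dz^2 .
\end{align*}
```

The coefficients now vary from point to point, yet the single global substitution written in the first line restores the constant matrix; the field in the rotating frame is therefore reducible in the sense of condition K.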
Using such methods, Einstein determined the path of a particle moving
freely in a gravitational field. In the frame $\bar S$ the trajectory of the particle is
a straight line whose equations are: $d^2\bar x^i/(d\bar x^0)^2 = 0$ ($i = 0, 1, 2, 3$). Since
our aim is to ‘look’ at the same particle from the accelerated frame S, it
proves useful to characterise the path in an invariant way, i.e. in a way
which does not depend on any particular coordinate system. Using the
intrinsic parameter s instead of $\bar x^0$ (i.e. t), we obtain:

$$\frac{d^2 \bar x^i}{ds^2} = 0 \tag{1}$$

These equations are easily seen to give the integral $\int ds$ a stationary value.
In other words:

$$\delta\left(\int ds\right) = 0 \quad\text{when}\quad \frac{d^2 \bar x^i}{ds^2} = 0 \text{ in } \bar S. \tag{2}$$

In view of the fact that ds is an invariant, this last equation means that, in
S and in any other frame of reference, the trajectory of the particle is a
geodesic. From the absolute differential calculus we know that a geodesic
in S satisfies the following equations:

$$\frac{d^2 x^i}{ds^2} + \left\{ {i \atop mn} \right\} \frac{dx^m}{ds}\frac{dx^n}{ds} = 0 \tag{3}$$

where

$$\left\{ {i \atop mn} \right\} = \frac{1}{2}\, g^{iu}\left(\frac{\partial g_{mu}}{\partial x^n} + \frac{\partial g_{nu}}{\partial x^m} - \frac{\partial g_{mn}}{\partial x^u}\right).$$

Comparing (1) and (3), we conclude that in the accelerating frame S the
path of the particles is no longer a ‘straight’ line in the ordinary sense of the
word: the quantity which deflects the particle from a straight trajectory is
the quantity represented by the Christoffel symbol $\left\{ {i \atop mn} \right\}$. Since $\left\{ {i \atop mn} \right\}$ is a
function of the $g_{mn}$’s and since we ascribe the deviation from a straight path
to the action of a gravitational field, the latter is represented by the $g_{mn}$’s or
rather by the partial derivatives $\partial g_{mn}/\partial x^u$; note that the coefficients $\left\{ {i \atop mn} \right\}$
vanish if the $g_{mn}$’s are constant. The $g_{mn}$’s can therefore be looked upon as
the gravitational potentials.

So far we have implicitly assumed that the gravitational field is reducible,
i.e. ‘inertial’ in the old terminology. We now generalise our results by
extending them to the case where there need not exist a global transforma-
tion which makes all the $g_{mn}$’s constant. Hence, even if the field is irreducible,
the path of a free particle is still a geodesic and the metric tensor $(g_{mn})$ still
represents the gravitational potential.
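
Why the metric coefficients deserve the name ‘potentials’ can be seen from the slow-motion, weak-field limit of the geodesic equation. The following is a standard textbook sketch rather than a step taken in the text; the Newtonian potential $\phi$ and the ansatz $g_{00} \approx 1 + 2\phi$ are assumptions introduced here, with the signature $(+,-,-,-)$ and the velocity of light equal to unity as above.

```latex
\begin{align*}
g_{00} &\approx 1 + 2\phi, \qquad g_{ab} \approx -\delta_{ab} \quad (a, b = 1, 2, 3),
  \qquad \text{static field, } |v| \ll 1,\ ds \approx dt, \\
\left\{ {a \atop 00} \right\}
  &= \tfrac{1}{2}\, g^{au}\left(2\,\frac{\partial g_{0u}}{\partial x^0}
     - \frac{\partial g_{00}}{\partial x^u}\right)
   \approx \tfrac{1}{2}\,\frac{\partial g_{00}}{\partial x^a}
   = \frac{\partial \phi}{\partial x^a}, \\
\frac{d^2 x^a}{dt^2} &\approx -\left\{ {a \atop 00} \right\}
   = -\frac{\partial \phi}{\partial x^a} .
\end{align*}
```

In this limit the geodesic equation reproduces Newton's law of fall in a potential $\phi$, with $g_{00}$ playing the part of the classical potential.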

This method also enabled Einstein to determine the effect of gravitation
on other physical phenomena. He wrote the laws governing these
phenomena in a generally covariant form, which generally involves the
$g_{mn}$’s; he then abstracted from the condition K which was originally imposed
on the metric tensor. The presence of gravitation manifests itself through
the irreducible $g_{mn}$’s which occur in the new law.

There remained for Einstein the problem of finding the field equations
which are satisfied by the $g_{mn}$’s. His attack on the problem was again two-
pronged¹:
(1) The new law would have to yield Poisson’s equation $\nabla^2\phi = k\rho$ as a
limiting case. This requirement is identical with the one made in the case of
Special Relativistic laws. Because of Poisson’s equation Einstein expected
his own law to consist of second-order partial differential equations linear
in the second derivatives $\partial^2 g_{mn}/\partial x^u \partial x^v$.

(2) We have seen that the $g_{mn}$’s have a dual function: on the one hand they
represent the physical gravitational potentials and on the other they are
coefficients in the expression of $ds^2$; this second function is a geometrical
one. It can be said that Einstein geometrised gravitation or alternatively
that he physicalised geometry. Since the field equations describe a geomet-
rical state of affairs, they ought to be independent of any particular frame
of reference; i.e. they ought to be generally covariant.


Thus Einstein and Grossmann² started from the rather vague assumption
that the gravitational field is a geometrical entity (‘geometry’ is to be under-
stood as synonymous with ‘kinematics’ or ‘space-time geometry’). It had of
course long been known that gravity was caused by the presence of massive
bodies. Also, by the Special Relativistic equation $E = mc^2$, inertial mass
and energy are interchangeable. Finally, by the Principle of Equivalence,
inertial mass and gravitational mass are identical (wesensgleich). Putting all
these assumptions together, Einstein guessed that gravitation is a geomet-
rical phenomenon related to the energy content of space. Grossmann
expected the field equations to be of the form A = B, where A and B
respectively represent the geometry and the energy content of space. He
took for granted that the geometry in question is Riemannian and not some
more general, i.e. less structured, geometry. In the case of free space, B
vanishes, so we are left with the equation A = 0. In order to determine A,
Grossmann considered the tensor $B^i_{\,jkl}$ which Riemann and Christoffel had
shown to be essentially relevant to the geometrical properties of space.
Grossmann knew that the equations $B^i_{\,jkl} = 0$ imply that the space is flat
and hence that the field can be globally transformed away. He also knew
that the gravitational field is generally irreducible. So Grossmann weakened
the relations $B^i_{\,jkl} = 0$ by using a standard mathematical technique,


1 For a more detailed account see Zahar [1973].
2 Einstein and Grossmann [1913]. 
namely the contraction of a contravariant index with a covariant one. Note
that there exists essentially one way of contracting two such indices in
$B^i_{\,jkl}$, because $-B^i_{\,jil} = B^i_{\,jli}$ and $B^i_{\,ikl}$ vanishes identically. By using
this type of reasoning, Grossmann finally obtained the equations $B^s_{\,ijs} =
R_{ij} = 0$, which are accepted today as the correct field equations for free
space. Thus, through giving a precise mathematical formulation of his
initial assumption that gravitation is a geometrical phenomenon linked to
the energy content of space, Grossmann had obtained the much stronger
proposition: $R_{ij} = 0$. In fact, as I have already said above, Grossmann’s
initial assumption was so weak as not even to imply that the geometry to be
used is Riemannian. The reason for resorting to Riemannian geometry was
its availability at the time as a fully developed mathematical system. This
illustrates the point made earlier about the first heuristic role of mathe-
matics in physical discovery.¹

The solution $R_{ij} = 0$ for free space was rejected by its authors for two
reasons, both of which turned out to be unfounded. First, Grossmann
believed that $R_{ij} = 0$ would not yield the classical equation $\nabla^2\phi = 0$ as a
limiting case for weak static fields. This was a relatively simple mathematical
error. Secondly, both Einstein and Grossmann thought that, given the
appropriate boundary conditions, the ten equations $R_{ij} = 0$ would uniquely
determine the ten functions $g_{ij}$. This means that we are not at liberty to
choose an arbitrary frame of reference because the functions $g_{ij}$ are
generally altered by a change of coordinates. Thus it seems that the Rela-
tivity Principle is violated. Hilbert saved the situation by showing that the
equations $R_{ij} = 0$ are not all independent; the left hand sides satisfy
four identities, which give the exact degree of arbitrariness necessary for
the free choice of a frame of reference (four identities corresponding to four
coordinates).²
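
The counting behind Hilbert's remark can be made explicit. In present-day notation the four identities are usually written as the contracted Bianchi identities; that identification, the covariant-derivative symbol $\nabla^i$ and the curvature scalar $R = g^{ij}R_{ij}$ are assumptions of this sketch and are not spelled out in the text.

```latex
\begin{align*}
\nabla^{i}\left(R_{ij} - \tfrac{1}{2}\, g_{ij} R\right) &\equiv 0 \qquad (j = 0, 1, 2, 3), \\
\underbrace{10}_{\text{equations } R_{ij} = 0}
 \;-\; \underbrace{4}_{\text{identities}}
 &= \underbrace{6}_{\text{independent conditions on the ten } g_{ij}} .
\end{align*}
```

Four of the ten functions $g_{ij}$ thus remain undetermined, which is exactly the freedom needed to choose the four coordinate functions at will.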


3.2 The Successful Explanation of the Perihelion of Mercury and its Role in 
the Further Development of the General Theory. 


In 1915 Einstein went back to the equations $R_{ij} = 0$ for free space; for the
case where non-gravitational energy in the form of a symmetric tensor $T_{ij}$
is present, Einstein found it natural to generalise the equations $R_{ij} = 0$ to
$R_{ij} = kT_{ij}$. This generalisation turned out to be untenable. However,
using only the equations $R_{ij} = 0$ for the field created by the sun, Einstein
explained the precession of Mercury’s perihelion. He published this result
on the 22nd November 1915. This explanation of a well-known fact was
tremendously important for the following reasons: the predicted fact is


1 Cf. Part I, pp. 109-11.
2 Cf. Einstein [1915c].
completely novel in the sense that I have previously explained¹; that is,
Einstein did not use the known behaviour of Mercury’s perihelion in con-
structing his theory. In fact this empirical prediction is all the more dramatic
because it flows from a hypothesis which is so speculative, so ‘metaphysical’,
that one may wonder whether it belongs to physics or to pure mathematics.
Thus, through explaining the ‘anomalous’ motion of Mercury's perihelion,
Relativity Theory superseded its rivals from a strictly empirical point of
view.² This empirical success also proved crucial for the further development of
the Relativity Programme.
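
The size of the effect at stake can be checked with a few lines of arithmetic. The sketch below uses the standard first-order expression for the relativistic perihelion advance per orbit, 6πGM/(c²a(1 − e²)); neither this formula nor the orbital values appear in the text, so both are assumptions brought in here for illustration, and the function names are hypothetical.

```python
import math

# Reference values assumed for this illustration (not taken from the text).
GM_SUN = 1.32712440018e20    # heliocentric gravitational constant, m^3 s^-2
C = 299_792_458.0            # velocity of light, m s^-1
A_MERCURY = 5.7909e10        # semi-major axis of Mercury's orbit, m
E_MERCURY = 0.2056           # eccentricity of Mercury's orbit
PERIOD_DAYS = 87.969         # sidereal period of Mercury, days
DAYS_PER_CENTURY = 36525.0   # Julian century, days


def perihelion_advance_per_orbit(gm, a, e, c=C):
    """First-order relativistic perihelion advance, in radians per orbit."""
    return 6.0 * math.pi * gm / (c ** 2 * a * (1.0 - e ** 2))


def arcsec_per_century(gm=GM_SUN, a=A_MERCURY, e=E_MERCURY,
                       period_days=PERIOD_DAYS):
    """Accumulated advance over one century, in seconds of arc."""
    orbits = DAYS_PER_CENTURY / period_days
    radians = perihelion_advance_per_orbit(gm, a, e) * orbits
    return math.degrees(radians) * 3600.0


if __name__ == "__main__":
    print(f"{arcsec_per_century():.1f} seconds of arc per century")
```

With these inputs the result comes out close to 43 seconds of arc per century, the residual precession that the classical theories left unexplained.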

Einstein realised that the equations $R_{ij} = kT_{ij}$ are untenable because
the right hand side is divergenceless whereas the left hand side is not.
Other things being equal, it would have been natural for Einstein to aban-
don this whole approach, i.e. to reject both the general equations $R_{ij} = kT_{ij}$
and the special case $R_{ij} = 0$. The fact that the equations $R_{ij} = 0$ had
enabled him to explain the motion of Mercury convinced him that the fault
lay not with his overall approach but with the method of generalising the
equations $R_{ij} = 0$. In other words Einstein kept the equations $R_{ij} = 0$
for free space and looked for a new method of generalising these relations.³

At this critical stage Einstein was helped by the mathematical machinery
of his system and by the Special relativistic law about the interchange-
ability of inertial mass and energy. By using variational methods, he
extracted from the relations $R_{ij} = 0$ a matrix $t^j_i$ which obeys a formal con-
servation law: $t^j_{i,j} = 0$, i.e. $\partial t^j_i/\partial x^j = 0$. However, $t^j_i$ is not a tensor and
hence appears not to be susceptible of any physical interpretation. It
seemed as if $t^j_i$ ought to be treated as a mere mathematical entity which may
be used for purposes of convenience but is otherwise devoid of physical
meaning. Realising that he had reached an impasse with the equations
$R_{ij} = kT_{ij}$, Einstein insisted against all odds on interpreting $t^j_i$ as a
gravitational energy matrix; but energy represents inertial mass and hence
also gravitational mass (Principle of Equivalence); thus we reach the sur-
prising physical result that gravitational energy acts as one source of the
gravitational field. In passing from $R_{ij} = 0$ to $R_{ij} = kT_{ij}$, Einstein had
‘mistakenly’ supposed that he was going from a case in which energy was
totally absent to a case in which it was not. He had forgotten that even when
$R_{ij} = 0$, gravitational energy may be present. His solution consisted in
1 Cf. Part I, pp. 101-4.
2 Lorentz in his [1900] had produced a theory of gravitation which, however, explained
only a small fraction (one tenth) of the residual angle of precession of Mercury’s peri-
helion.
3 Had the General Theory’s prediction of Mercury’s behaviour not been novel, e.g. had
Einstein ‘adjusted parameters’ in order to obtain the correct experimental results, he
would surely not have had such confidence in the equations $R_{ij} = 0$, in the face of the
breakdown of the more general form $R_{ij} = kT_{ij}$.
rewriting the equation $R_{ij} = 0$ so as to bring out this dependence of the
field on its own energy and then generalising by adding the non-gravita-
tional energy $kT^j_i$ to the gravitational energy $t^j_i$. In other words Einstein
rewrote $R_{ij} = 0$ in the form $F(t^j_i, t) = 0$, where: $t = t^i_i$; he then gener-
alised these equations by adding $kT^j_i$ to $t^j_i$ and $kT$ to $t$. Einstein obtained

$$F(t^j_i + kT^j_i,\; t + kT) = 0$$

which turned out to be equivalent to the currently accepted field equations:

$$R_{ij} = -k\left(T_{ij} - \tfrac{1}{2}\, g_{ij} T\right)$$

This illustrates what I called the second role of mathematics in physical
discovery.¹
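
It may be useful to see why this form escapes the objection raised against $R_{ij} = kT_{ij}$. The short calculation below is a standard one and is offered only as a sketch; it assumes four dimensions (so that $g^{ij}g_{ij} = 4$) and writes $R = g^{ij}R_{ij}$ and $T = g^{ij}T_{ij}$, notation not used in the text.

```latex
\begin{align*}
R_{ij} &= -k\left(T_{ij} - \tfrac{1}{2}\, g_{ij} T\right)
  \;\Longrightarrow\;
  R = g^{ij} R_{ij} = -k\left(T - 2T\right) = kT, \\
R_{ij} - \tfrac{1}{2}\, g_{ij} R
  &= -k\,T_{ij} + \tfrac{1}{2}\, k\, g_{ij} T - \tfrac{1}{2}\, k\, g_{ij} T
   = -k\,T_{ij} .
\end{align*}
```

The combination on the left of the last line is the one whose divergence vanishes identically, so both sides of the final equations are now divergenceless, which was not the case for $R_{ij} = kT_{ij}$.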


Let me end by reviewing the assumptions and methods which are com- 
mon to the Special and to the General Theories of Relativity. These com- 
mon assumptions and methods are the bridges connecting the two stages of 
the programme. Let me make the obvious point that in both theories 
Special Relativity holds locally, i.e. it holds in infinitely small domains 
about each point. Einstein starts the two theories by analysing two well- 
known ‘facts’ from the same point of view: he analyses the result of the 
induction experiment and the ‘fact’ that all bodies fall with the same 
acceleration. Common to both theories is the law concerning the inter-
changeability of mass and energy. The equation $E = mc^2$ was a dramatic
new result implied by Special Relativity; moreover it was precisely this 
result which led Einstein to transcend the Special Theory and resort to 
General Relativity as a framework which would embody gravitation. We 
have also seen that the law about the interchangeability of mass and energy 
played a crucial role in enabling Einstein to modify his field equations 
$R_{ij} = kT_{ij}$. Both the Special and the General Theories make use of the
Covariance Principle: Lorentz-covariance in the case of Special Relativity 
and general covariance in the case of General Relativity. In both stages of the 
programme scientists exploited the assumption that classical theories ought 
to be limiting cases of the new relativistic laws. In Special Relativity the 
law of inertia, Maxwell’s equations, Newton’s second law and the laws of 
the conservation of energy and momentum were used in order to determine 
new and different Lorentz-covariant equations. In General Relativity 
Poisson’s equation was exploited: since Poisson’s equation was to be a limit-
ing case of the law of gravitation, the latter was expected to consist of a
system of second order partial differential equations which would be linear
in the second order derivatives. The only essentially new method peculiar


1 Cf. Part I, pp. 109-11. 
to General Relativity was the heuristic use which was made of the Principle
of Equivalence and which has no analogue in the Special Theory. 

Thus the question asked in the title of this paper, ‘Why did Einstein’s 
Programme supersede Lorentz’s?’, has now been answered. Already in 1905 
the Relativity programme proved heuristically superior to its classical rival; 
at a time when the notion of a quasi-material ether was becoming heuristic- 
ally barren, Einstein provided a powerful new tool for the construction of 
Lorentz-covariant laws yielding the corresponding classical theories as 
limiting cases. However, heuristic power gives a measure only of intellectual 
achievement and not of scientific progress. After all, science is empirical. 
Special Relativity by itself did not empirically supersede Lorentz’s pro-
gramme. Bucherer’s experiment¹ confirmed both Lorentz’s and Einstein’s
hypotheses and Kaufmann’s experiment² disconfirmed them both. Indeed,
before the advent of General Relativity the scientific community (e.g. 
Planck, Poincaré, Bucherer, Kaufmann and Ritz) spoke of the Lorentz- 
Einstein theory and contrasted it with the more classical theories of 
Abraham and Ritz: they regarded the theories of Lorentz and Einstein as 
observationally equivalent.³

It was only when Einstein’s programme yielded General Relativity that 
it superseded Lorentz’s empirically by successfully explaining the 
‘anomalous’ precession of Mercury’s perihelion.⁴ This explanation con-
stitutes empirical progress because, according to my amended definition of
‘novel fact’, the behaviour of Mercury, although well-known, is nonetheless 
a novel fact predicted by General Relativity. 

This new (General Relativistic) phase in which empirical success was 
achieved was, as it happened, more speculative than the previous (Special 
Relativistic) phase. In this later phase Einstein strengthened his earlier heur- 
istic and thus arrived at a covariant theory of gravitation (he had been unable 
to accommodate gravitation within the confines of Special Relativity⁵).

Nevertheless there is a strong continuity between the Special Theory 
and the General Theory. The latter can be regarded as a more powerful 
realisation of essentially the same outlook and the same heuristic which had 
previously led Einstein to Special Relativity. During the earlier phase the 
deep differences between Lorentz and Einstein remained primarily heuristic 
(and of course metaphysical) ones. It was only with the development of the 
General Theory that the underlying conflict between the two programmes 
was reflected at the empirical level: with regard to Mercury’s perihelion, the 
bending of the light rays and the red shift, General Relativity made pre- 
dictions which were never matched by Lorentz’s (or Ritz’s) theories. 


1 Cf. Bucherer [1909]. 2 Cf. Kaufmann [1905]. 3 Cf. Ehrenfest [1913], p. 321.
4 Cf. Einstein [1915c]. 5 Cf. Einstein [1912].
Discussions


COMMENTS ON ‘THE INCOMPATIBILITY OF MACH’S 
PRINCIPLE AND THE PRINCIPLE OF EQUIVALENCE IN 
CURRENT GRAVITATION THEORY’ 


I am not sure of the correctness of the Gedankenexperiment designed by 
Woodward and Yourgrau [1972] to prove the claim that the inertial field for one 
body alone in the universe must be null. They state that a body with a finite 
mass density isolated from the ‘influence of all physical fields (inertial, gravita- 
tional, electromagnetic, etc.) produced by all of the external matter in the 
universe’ will, under Mach’s principle, have no inertia. Pauli [1958] states the 
principle in a form slightly different from that of Woodward and Yourgrau. He 
says ‘it has to be postulated, in particular, that the inertia of matter is solely 
determined by the surrounding masses. It must therefore vanish when all other 
masses are removed, because it is meaningless, from a relativistic point of view, 
to talk of a resistance against absolute accelerations (relativity of inertia).’¹

At first glance one might conclude that Pauli agrees with Woodward and 
Yourgrau. Yet I believe Pauli intended his statement of Mach’s principle to be 
applicable for a single elementary point particle only. Any extended body is 
comprised of atoms and electrons. Even the simplest atom, hydrogen, consists of 
a proton and an electron, quite apart from considerations of the electromagnetic 
field. Woodward and Yourgrau have postulated a body with a finite mass density, 
hence one which occupies a finite volume and must be looked on as an extended 
structure. If we consider one electron in that structure, is it not reasonable to 
attribute that electron’s inertia to the physical fields (inertial, gravitational, 
electromagnetic) arising from those other nuclei and electrons of which the body 
consists—even though the body be isolated from all external influences? I feel 
that the surrounding masses whether they are far removed or a part of the body 
itself determine the inertial properties. If the only surrounding masses are those 
which are part of the body, then these alone will determine the inertia of the 
mass element they enclose. If one considers a universe containing one hydrogen 
atom only, then Mach’s principle states that the inertia of the electron is deter- 
mined by the proton alone and the reverse. 

Even an elementary particle has a finite mass density and consequently must 
be considered to have extension. Therefore each element of matter in the body 
would owe its inertia to the rest of the body, were there no other matter in the 
universe. Sachs [1972] has made a related point in suggesting that the strength 
of coupling between the stars and an electron observed on earth could be so 
weak that one could neglect its contribution to the electron’s inertia. He then 
argues that the immediate neighbourhood of the electron, i.e. the physical vacuum
containing electron-positron pairs and electromagnetic radiation, would be 
sufficient to maintain the validity of Mach’s principle. This is not complete 
isolation, of course, for it implies a vacuum with energy. 


1 Pauli [1958], p. 179. 
It should be pointed out as a footnote that Mach himself did not believe in
atoms, a fact which influenced the form he gave to his principle. In the English
edition of his ‘Science of Mechanics’ (Mach [1893]) he says ‘Atoms cannot be
perceived by the senses; like all substances, they are things of thought. Further- 
more, the atoms are invested with properties that absolutely contradict the 
attributes hitherto observed in bodies.... The atomic theory plays a part in 
physics similar to that of certain auxiliary concepts in mathematics; it is a mathe-
matical model for facilitating the mental reproduction of facts.’¹

The argument of Woodward and Yourgrau becomes valid for an isolated 
point particle only. But this implies the possibility of an infinite mass density 
with all its attendant problems. Therefore one would conclude that only in the 
complete absence of matter would a space be free of inertial properties. This is 
obviously compatible with the equivalence principle for any test of equivalence 
would necessitate the introduction of matter in the form of an extended body. 


RONALD G. NEWBURGH 
Air Force Cambridge Research Laboratories 
Bedford, Massachusetts 
MACH’S PRINCIPLE, THE EQUIVALENCE PRINCIPLE AND
GRAVITATION: A REJOINDER TO NEWBURGH 


1 Introduction


In this issue of this journal Newburgh has raised a point of considerable interest 
and importance with regard to our claim that Mach’s Principle and the Principle 
of Equivalence are incompatible under current, general relativistic, gravitational 
field theory.²,³ It is not clear to us whether or not Newburgh subscribes with
Sachs⁴ to the view that the bulk of the inertia of an object is created locally by
vacuum polarisation effects, a view that is almost undoubtedly false in view of the
results obtained through mass and charge renormalisation in quantum electro-
dynamics.⁵ This issue notwithstanding, Newburgh’s comments with respect to
the self inertia of an extended, neutral mass distribution of finite radius are, 
nevertheless, well taken. 


1 Mach [1883], pp. 588-9. Page reference to the sixth American edition, translated by T. J.
McCormack.
2 Cf. Newburgh [1973].
3 Cf. Woodward and Yourgrau [1972].
4 Cf. Sachs [1972]. A similar argument has also been made recently by Tomozawa in his
[1972].
5 Cf. Sakurai [1967], pp. 271 and 249.
Newburgh’s criticism of our conjecture appears to stem from the fact that our
definition of the necessary and sufficient conditions that a theory of gravitation/ 
inertia must satisfy to be Machian is not rigorously formulated. To clarify our 
position on the physical content of Mach’s Principle let us mention the most well-
known interpretations of that principle. Mach himself, of course, merely claimed
that the inertial properties of bodies were due to the presence of other matter in 
the universe; in other words, it is nonsensical to talk about the state of motion of 
a single, point mass in an empty universe. The attempts to delineate the physic- 
ally significant content of this observation have been many and varied, but the 
common feature of almost all of these endeavours has been the identification of 
the principal part of the inertial properties of matter with the gravitational 
properties of matter. 


2 Conditions for a Machian Theory of Inertia and Gravitation. 
We now briefly indicate what we consider to be the most important of the various 
criteria that have been suggested as embodying the physical essence of Mach’s 
Principle. 
1 The Einstein criteria¹:
A The inertia of a body must increase when ponderable masses are piled up 
in its neighbourhood. 


B A body must experience an accelerating force when neighbouring masses 
are accelerated, and, in fact, the force must be in the same direction as that 
acceleration. 


C A rotating hollow body must generate inside of itself a ‘Coriolis field’, 
which deflects moving bodies in the sense of the rotation, and a radial 
centrifugal field as well. 


D The gravitational/inertial field equations must yield no solution for an 
empty universe.


2 The Hönl criterion²: The role of Mach’s Principle is to act as a selection prin-
ciple in the determination of physical and non-physical global solutions of the
gravitational/inertial field equations.

3 The Sciama criterion³: The inertial properties of a small, neutral body are
almost totally induced by the gravitational interaction of the remainder of the 
matter in the universe with the body; and that the inertia of a body is a reaction 
against accelerations of the body relative to the background gravitational field 
produced by the rest of the matter in the universe. 


4 The Pauli criterion⁴: The inertia, and thus the gravitational field, of a single
body in an otherwise empty universe must be null. 


In our opinion all of these criteria, with the possible exception of 1 D, must be
met if a theory of gravitation/inertia is to be truly Machian. Above all, the Pauli 


1 Cf. Einstein [1956], p. 100 and North [1965], pp. 83-92.

2 Cf. Hönl [1953]. This idea has been more recently advocated, modified and developed by
Hönl and Dehnen [1963], Wheeler [1964], Sciama, Waylen and Gilman [1969], and the
recent work of Raine [1973a] and [1973b] among others.

3 Cf. Sciama [1953].

4 Cf. Pauli [1958]. Sciama [1953] also touched upon this point.
criterion, perhaps the most stringent of the various restrictions on the field
equations, must be satisfied. This criterion, as stated by Pauli (and previously by
us),¹ is not sufficiently rigorous, and it is the imprecision of its statement that has
given rise, in our opinion, to Newburgh’s criticisms. We, therefore, now give a 
strict restatement of the Pauli criterion: 


Given a neutral body of finite mass density and radius alone in a universe, for
the gravitational/inertial field equations to be consistent with Mach’s Principle
they must satisfy the condition that in the limit as the radius of the body goes to
zero, the magnitude of the field goes everywhere (except at the matter
singularity) to zero. That is,

$$\lim_{r \to 0} m = 0,$$

for an external (massless) observer.

If this stipulation is met, a body of finite radius will have a self-inertial mass as 
Newburgh correctly claims it should, but it will be exceedingly small in compari- 
son with the inertia that would be induced in the body by the external matter in a 
universe like ours. On the other hand, the intent of the Pauli criterion is clearly 
consonant with the above definition. One should note, however, that the Pauli 
criterion, as stated above, requires that the definition of density (mass/volume) be 
somewhat more complex than the usual notion of this property. Where in the 
absence of this criterion we have but one density concept, we are now obligated 
to introduce two types of density, namely (1) the true density, that is, the ratio of the
unobservable ‘actual’ mass of the body to its volume (which still becomes 
infinite as the radius of the body goes to zero); and (2) the observed density, i.e. 
the density measured by an external observer which, in the limit as the radius of 
the body goes to zero, must satisfy the Pauli criterion by becoming zero. This 
implies that a ‘source strength suppression’ effect for the gravitational field, 
perhaps such as that found in Treder’s tetrad theory of gravitation,² is a necessary
consequence of the Pauli criterion. 


3 Discussion. 

At the outset, we note that general relativity fails to accord with the Pauli criter- 
ion. The Schwarzschild metric does not asymptotically approach the Minkowski 
metric everywhere as the radius of a single field source is contracted to zero. We 
may thus conclude that Mach’s Principle and general relativity are incompatible.³
But what of the Principle of Equivalence? 


1 Cf. Woodward and Yourgrau [1972]. 

2 Cf. Treder [1971]. 

3 Essentially the same conclusion can be reached by another line of reasoning. When one 
considers the solution of the general relativistic field equations for a sphere of small mass, 
the boundary condition that space-time be asymptotically flat at infinity must obtain. 
(This condition is required by the assumption of infinitesimal Lorentz invariance in 
general relativity theory.) But Raine [1973a] has shown that the supposition of criteria for 
Machianess equivalent to our criteria 2 and 3 above excludes all asymptotically flat 
solutions of the field equations as unphysical. Field equations that forbid the discussion
of a single particle and thus the consonance of the solution of the equations therefor with
Mach’s Principle (as is necessary for investigations of the accordance of the equations
with the Pauli criterion) are, in our opinion, unacceptable even if they admit of other
Machian solutions.
Employing a Gedankenexperiment analogous to that utilised in our previous
paper, and demanding the conservation of energy, it follows that our weak form 
(designated as strong by others) of the Equivalence Principle is valid. If the 
inertial, passive gravitational and active gravitational masses of a body in our 
universe, which is isolated from all external fields, do not all approach zero in 
constant proportion as the radius of the body is contracted to zero, then the 
conservation of energy and the law of action and reaction are both violated. Thus 
the equivalence of the various types of mass, but not the equivalence of gravita- 
tional fields and accelerated reference frames, is compatible with, and deducible 
from, Mach’s Principle if energy conservation is granted. 

We must emphasise at this point that the above arguments apply strictly only 
to totally neutral matter (matter consisting solely of elementary particles with no 
charge and no magnetic moment). The role of charge and the electromagnetic 
field in gravitational theory and under Mach’s Principle seems to us more com- 
plex than is presently believed. For example, note that the chief difference be- 
tween the Einstein form of the Equivalence Principle (which we designate as the 
strong form of that principle) and the form of this principle above demonstrated 
to be compatible with Mach’s Principle is the way in which the electromagnetic 
field is treated, i.e. the demand of infinitesimal Lorentz invariance and the way
in which the electromagnetic field energies are included in the gravitational field 
source tensor which permits one to assert the equivalence between a gravitational 
field and an accelerated frame of reference. This naturally leads to the conjecture 
that the failure of relativity theory to subsume Mach’s Principle is, at least in 
part, attributable to the way in which electromagnetism is handled in general 
relativity. 

Even if the field interaction difficulties just alluded to are ignored, other 
problems arise when one tries to stipulate the contribution to the inertial mass of 
an elementary particle due to its charge. One encounters immediately the pro- 
found difficulties involved in mass and charge renormalisation in relativistic 
quantum electrodynamics. Physicists have been able, so far, to avoid quantitative 
defects in the theory by employing the ad hoc device of ‘cut-offs’. But, since a 
plausible physical theory with respect to which the cut-off procedure may be 
viewed as an approximation is wanting, the artifice of cut-offs must be, and gener- 
ally is, held to be inherently unacceptable. Unfortunately, the electromagnetic 
inertial mass of a charged elementary particle depends logarithmically on the 
value of the cut-off—a magnitude presently unknown and believed to be 
theoretically undeducible. 

It turns out, curiously, that the cut-off dilemma is amenable to solution and 
can be eliminated if the following two assumptions are admitted: 


1 The bare charge of an elementary particle may only emit and absorb virtual
photons (of the Coulomb field) with energies that are integral multiples of 
some fundamental energy, and the bare charge acts as a perfect emitter and 
absorber of these quanta. 
2 The observed charge of an elementary particle is independent of the 
‘radius’ of its bare charge. 

The first postulate is but the natural extension of Planck’s hypothesis to the 


1 Cf. Woodward and Yourgrau [1972].
emission and absorption of virtual photons by charge, and the second merely a
slight modification and extension of the principle of charge conservation. Neither 
is particularly implausible. Nevertheless, it can be shown that these suppositions 
permit one to express the cut-off purely in terms of physically observable quanti- 
ties and lead to the conclusion that both the cut-off for, and electromagnetic mass 
of, elementary charged particles are finite, particularly in the limit of zero radius. 
However, since the demonstration of this conclusion is somewhat involved, we 
leave a detailed discussion of it to a forthcoming article, restricting ourselves 
here to the observation that the electromagnetic contribution to the inertia of a 
body composed of physically realisable particles is, at best, exceedingly small. 


4 Conclusion. 
In conclusion we would like to mention two types of gravitational theories that, 
for neutral matter, may be modifiable to a strictly Machian form. Both are non- 
metric, tensor theories. The first type is the class of tetrad theories investigated 
by Ogievetsky and Polubarinov¹ in which the graviton is assumed to possess
a non-zero rest mass. These theories are of interest because, while our weak form 
of the Equivalence Principle is preserved, the Einstein form of the Equivalence 
Principle is broken. Thus, one of the apparent causes of the incompatibility of 
general relativity and Mach’s Principle is removed. 

The other theory of interest is the tetrad theory of Treder.² In this theory the
field equations assume the form, 

hA+GhdT* = 0, 
where the h’s are the components of the tetrad field, G the coupling constant, 
with, 
Tey = T4837, 

where Ty is the mixed source tensor and the field is constructed in a global 
Minkowski space-time. This theory has the attractive feature that the active 
gravitational mass of a body is a function of its radius and the gravitational field 
strength due to external sources. One of the unacceptable properties of the theory, 
from the Machian point of view, is that the equivalence between active and pass- 
ive gravitational mass is broken. However, the difficulties of these theories do not 
seem insurmountable, and the construction of a rigorously Machian theory of 
gravitation with electromagnetism properly accounted for will, hopefully, prove 
possible. 


Postscript 


It might be historically not without interest that Einstein himself, after 1930, 
was definitely of the opinion that General Relativity Theory and Mach’s Principle 
were at the very least not dependent upon each other and, in all probability, inconsistent
with each other. According to Bergmann,³ Einstein never changed his
opinion on this issue. The spate of unified field theories which was the fashion 
in the 30’s until 1950 or so, has essentially faded into oblivion, in spite of occasional
papers by scholars who continued Einstein’s dream. At present, it seems


1 Cf. Ogievetsky and Polubarinov [1965].
2 Cf. Treder [1971].
3 Bergmann, letter to the authors, 8 November 1972.
that we are exposed to an onslaught of papers on Mach. Wigner explains this
huge output of papers dealing with Mach’s Principle (or rather contra Einstein 
and pro Mach) by the fact that ‘... the recent interest in the subject arose be- 
cause we have a good many younger physicists who are courageous enough to 
tackle this problem’. The lack of success of Wigner’s generation naturally dis- 
couraged most workers in this field. This, possibly, has led to the situation that 
we are now exposed to an abundance of papers on Mach.

Strangely enough, Dicke* has now given up his previous dedication to Mach’s 
Principle. Nonetheless, despite the plethora of papers in favour of Mach, Dicke 
himself does not regard this primarily as a lasting return to this controversial 
Principle! He too emphasises strongly that we can now recognise a marked shift 
away from our preceding strong interest in unified field theories. His newly
envisaged orientation is worth citing verbatim:


The revolution that has occurred since 1960 is a shift from relativity as a 
mathematically formal science, divorced from the main stream of physics, to 
a science drawing inspiration from observations and, in turn, stimulating 
observations. ...[Mach’s] Principle seems not to have strongly affected 
recent developments. 


This new trend is—so he claims—not necessarily related to Mach at all. Although 
we are in agreement with Dicke as regards the empirical aspect of recent gravita- 
tional investigations, we cannot accept his views on Mach’s Principle. We take 
the position, with Wigner, that Mach’s Principle is still of fundamental importance 
in gravitational theory. 

Lastly, we feel very much obliged to Newburgh’s reply to our paper on Mach 
in this journal, because his rejoinder has suggested to us some novel ideas about 
the relation of Mach’s Principle, gravitation, and charged elementary particles. 
Another paper, to be published shortly, will considerably transcend the issues
dealt with by our former article and Newburgh’s reply. For this ‘catalytic effect’
we are most grateful to him. 

JAMES F. WOODWARD 

California State University, Fullerton 
and 

WOLFGANG YOURGRAU 

University of Denver 
LOGICAL COMPARABILITY AND CONCEPTUAL DISPARITY
BETWEEN NEWTONIAN AND RELATIVISTIC MECHANICS 


Articles and notes published over the last ten years or so and dealing with the 
question of so-called incommensurable theories in science have resulted from the
controversy over the structure, changes and rational evaluation of scientific 
theories. Under attack have been a number of philosophical views of science: 
from Logical Empiricism, mainly in the works of Carnap, Reichenbach and 
Hempel, through a variety of empiricist doctrines propounded by professional 
philosophers e.g. Braithwaite and Nagel, or by some physicists e.g. Bridgman, 
Bohr, Einstein, to earlier critics of Logical Empiricism e.g. Popper. The attack 
was originally launched by Hanson, Toulmin, Kuhn and Feyerabend, but many 
others joined in. 

Some criticisms were meant to apply exclusively to the ‘logical approach’ 
identified with Logical Empiricism. So, for example, it was suspected that 


‘... the view that scientific theories are interpreted axiomatic systems may 
have blinded its adherents to many of the functions of those theories and 
their components ...’, ‘.. . even the highly developed scientific theories on 
which the axiomatic approach concentrates may be inadequately treated 
when looked upon as mere interpreted axiomatic systems. For the logician 
deals with those theories and their constituents as static, frozen in a logical 
mold; but perhaps there are more ‘dynamic’ functions such an approach 
tends to make us overlook . . .’ (Shapere [1965], p. 28). 


One criticism, however, was meant to apply not only to Logical Empiricism 
but to all those mentioned as being under attack: it is, that they have failed to see 
the existence and importance in science of rival, incommensurable (logically and 
empirically non-comparable) theories and that—owing to that failure—they have 

overrationalised their philosophic accounts of science, seeing or demanding
logical relations (consistency, incompatibility, deducibility, reducibility, definability
etc.) where there are none. The critics seem to regard as a new and important
insight and as their contribution to the philosophy of science the claim that
so-called revolutionary changes in science are disruptive, since they bring about 
radical changes in fundamental scientific concepts owing to which either piece- 
meal or even wholesale logical comparisons of rival theories may be impossible; 
this claim sets, according to critics, limits to rational (here: logical and empirical) 
arguments in science. 

In two previously published articles (Giedymin [1968] and [1970]) I pointed 
out that the so-called ‘incommensurability thesis’ both in its general form, i.e. as
the claim that there are incommensurable theories in science, and in its specific 
applications, e.g. the claim that Newtonian and Relativistic Mechanics are such
incommensurable theories, is certainly not a new insight, since it was explicitly 
formulated and systematically discussed by the author of ‘radical convention- 
alism’, Ajdukiewicz in three articles published in the Logical Empiricist 
‘Erkenntnis’ in the early nineteen-thirties. For completeness’ sake I should like to
add now that the general thesis dates back to the turn of the century when it was 
formulated and discussed in the controversy between Poincaré and LeRoy 
(Poincaré [1905]), the latter arguing in its favour and the former against.
Ajdukiewicz (Ajdukiewicz [1934]) resuscitated LeRoy’s thesis, based it on a more 
precise (pragmatic) conception of language and meaning, mentioned Newtonian 
and Relativistic Mechanics as instances of not intertranslatable languages and 
finally, in 1936, abandoned the thesis of ‘radical conventionalism’ as too extreme 
and untenable.¹ It should be obvious, therefore, that Logical Empiricists—
thanks to the influence exerted on the philosophy of modern empiricism by 
Conventionalists—were not only familiar for quite a long time with the idea of 
‘incommensurability’ (non-translatability, conceptual disparity) and disruptive 
changes in science but apparently came to the conclusion that the radical ‘in- 
commensurability thesis’ was untenable. 

The present note is intended to draw the reader’s attention to the analysis of 
the relation between Newtonian Mechanics and Special Relativity Mechanics 
(henceforth to be referred to as NM and SRM), given by Philipp Frank, one of 
the classics of Logical Empiricism (in Frank [1938]). Frank’s analysis of
the relation between NM and SRM is interesting in many ways. Firstly, it 
shows clearly that problems of disruptive changes and of conceptual disparity 
were known to and discussed by Logical Empiricists and that, therefore, some of 
them at least were not at all blinded to ‘dynamic’ problems of changes in actual 
science by their view of physical theories as interpreted axiomatic systems² and
did not deal with those theories as ‘static, frozen in logical mold’. Secondly, it 
reveals some of the reasons for the Logical Empiricist rejection of what came to 
be known later as the incommensurability claim. 

Frank, just as Ajdukiewicz and many others before him, was aware of the con- 
ceptual disparity between NM and SRM. He discussed in detail the syntactical 
and semantic differences between the fundamental concepts of the two theories 
(‘mass’, ‘time distance’, ‘length’, ‘force’) on which claims of indefinability, 
impossibility of translation etc. between the two theories have been based. On the 
other hand, on Frank’s account NM and SRM are mutually inconsistent and, 


1 I have traced the history of the problem in the writings of conventionalist philosophers
in my [1974].
2 Cf. Frank [1938], chapter 2.
therefore, logically comparable in spite of conceptual disparity; or, more
interestingly, Frank attempted to show that partial conceptual disparity may result from
logical incompatibility of two rival theories, without making the apparently incon- 
sistent claim that NM and SRM are both incompatible and incommensurable. I 
am going to suggest in the present note that Frank’s account of the relation 
between NM and SRM can be generalised, under certain assumptions, to similar 
cases of rival theories using Carnap’s ideas of indirect interpretation of theoretical 
terms and of the meaning postulates formulated in terms of the Ramsey sentence 
RT of a theory T. This will show, I hope, how certain conclusions drawn from 
case studies of actual scientific theories and general logical considerations fit 
together within the Logical Empiricist account of science. 

Frank’s analysis of the relation between NM and SRM is, in outline, as follows. 

NM and SRM are mutually inconsistent, therefore, logically comparable. For 
from SRM one can deduce the negations of certain empirical laws which have to 
be assumed as valid in NM in order to ensure the uniqueness of the ‘operational’ 
definitions of the fundamental concepts of NM, ‘mass’, ‘time-interval’, ‘distance’, 
‘force’. If SRM is true (or, if SRM is assumed hypothetically as true), then those 
empirical laws, assumed in NM to define the mentioned concepts, are false (or, 
have to be assumed as false). Consequently, Newtonian definitions no longer
satisfy the condition of uniqueness, the concepts in question ‘have no operational 
meanings’, as Frank, following Bridgman, put it, or—to use different words—are 
vague, i.e. have varying denotations (extensions). To become empirically meaningful
from the point of view of SRM, they have to be re-defined. On the other
hand, if NM is true (or, if NM is hypothetically assumed as true), then the 
concepts of Newtonian mechanics do have empirical meanings, i.e. they have 
unambiguously fixed physical denotations. It follows that if SRM is true (or is
assumed to be so), then the Newtonian concepts cannot have the same interpretations 
(in the extensional sense) which they are supposed to have on the assumption that 
NM is true. To conclude, according to Frank, there is conceptual disparity 
between NM and SRM which, however, far from making the two theories logically 
and empirically ‘incommensurable’, is due to the mutual inconsistency (i.e. logical 
comparability) between the two theories.

Let us recall some of the examples of such conceptual changes discussed by 
Frank. Consider first the definition of ‘one hour’ in terms of ‘the time during 
which the big hand of our pocket watch traverses an angle of 360 degrees’. 
Obviously, we mean that any pocket watch can be used and not necessarily ours. 
However, this amounts to assuming that ‘the hands of all pocket watches proceed 
with one and the same angular velocity’, which is a statement of a physical law 
about the behaviour of watch springs. Moreover, we mean any clock to be used, 
e.g. a pendulum clock and this, in turn, amounts to defining ‘one hour’ as ‘a 
duration of a certain number of oscillations of a pendulum’. These, apparently 
different definitions are equivalent if the following law is valid: ‘The unwinding
of a spring as an effect of its elasticity proceeds at a rate which is proportional to 
the frequency of the pendulum as an effect of gravity’ (Frank [1938], p. 432) 
assumed in NM. Now, from the postulates of SRM it follows that a clock which 
travels with the speed v relative to S loses time compared with a clock at rest in
S. Consequently:


... some operations which rendered, according to Newton’s law, identical
results no longer do so if Einstein’s principles are assumed to be true. The
operations by which the time distance between two events was defined did 
not mention the speed of the clock relative to any system of reference. For, 
according to Newton’s physics, this speed is without influence upon the 
march of the clock. If Einstein’s principles are true, this operational definition 
of the time distance between two events becomes ambiguous. We must 
specify the speed of the clocks used in measurement. In order to obtain an 
unambiguous result of our defining operation, we must no longer say that 
‘between the events A and B there is time distance of 10 seconds’ but that 
‘there is a time distance of 10 seconds if we use clocks which are at rest in a
particular system S′’. The velocity of S′ relative to S must be specifically
given. We use a ‘relativized language’ in order to make the description and 
the operations unambiguous (Frank [1938], p. 456). 
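
Frank's point about the ambiguity of the unrelativised definition can be illustrated numerically. The sketch below is not taken from Frank; it assumes the standard special-relativistic time-dilation factor, which the quoted passage presupposes but does not write out, and the function names and sample speed are purely illustrative.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def lorentz_factor(v: float) -> float:
    """Standard factor 1 / sqrt(1 - v^2/c^2) for a speed v below c."""
    if abs(v) >= C:
        raise ValueError("speed must be below c")
    return 1.0 / math.sqrt(1.0 - (v / C) ** 2)


def elapsed_on_moving_clock(dt_rest: float, v: float) -> float:
    """Time shown by a clock moving at speed v relative to S,
    while dt_rest seconds elapse on clocks at rest in S."""
    return dt_rest / lorentz_factor(v)


# The same pair of events yields different readings for different clock speeds,
# so 'the time distance between A and B' needs v to be specified.
for v in (0.0, 0.6 * C):
    print(v / C, elapsed_on_moving_clock(10.0, v))
```

Once the speed of the clocks is fixed, the result is unique again, which is the sense in which a 'relativized language' removes the ambiguity.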


Again, consider Newton’s Second Law as a definition of ‘force’ and then of 
‘mass’. To make the formula ‘f = ma’ an unambiguous definition of ‘f’ ‘... certain
physical effects must be confirmed’, namely ‘... the product ma has to be
independent of the mass m and has to depend on the situation of the moving body 
in its environment...’ (Frank [1938], p. 441). Alternatively, assuming that the 
field of force is known, e.g. given by Coulomb’s law, one gets the mass m by 
measuring the acceleration a: 


... According to Newton’s mechanics the result is independent of whether 
the initial velocity was small or great. But if Einstein’s principles are right, 
this operational definition becomes ambiguous. The acceleration a (and, 
therefore, m) depends actually upon from what initial velocity we start the 
experiment. In order to obtain an unambiguous result, we have to specify 
the operation involved, in particular the initial velocity v. If we require that 
the initial velocity be zero relative to (the fundamental system) S, the 
acceleration becomes unambiguously determined. We must, therefore, use a 
modified operational definition of ‘mass’. We can either make the specifica- 
tion that the initial velocity relative to S is zero, then we define a concept 
which is called ‘rest mass’ mo. Or we can include the initial velocity v in the 
description of the operation. Then acceleration and mass themselves become 
dependent on v. We obtain a physical quantity which is no longer a con- 
stant but a function of v. This quantity is called ‘mass’ in the new mechanics. 
By using this definition we can formulate the laws of motion in the simple 
form: mass times acceleration equals force (ma = f). But the mass m is now 
a function of v (Frank [1938], pp. 455-56). 


I shall now recapitulate some of the main points of Frank’s analysis and make 
a few comments: 

(1) On Frank’s account NM and SRM are mutually inconsistent and, therefore,
logically comparable. In spite of this, or rather owing to this fact, there is conceptual
disparity between the two theories, i.e. the specific, theoretical concepts
of NM have undergone changes in the transition to SRM. This claim, that NM
and SRM are logically comparable and yet to some extent conceptually disparate,
is, of course, not peculiar to Frank’s viewpoint.¹

(2) The ‘mechanism’ of conceptual change in the transition from NM to 
SRM is, according to Frank, as follows:
Some empirical laws have to be valid to ensure the uniqueness of the definitions
of metric terms. If those laws happen to be false, the terms in question become
vague and lose their empirical (‘operational’) meaning. If a theory T′ logically
implies the negation of the laws assumed in another theory T to define the terms
of T, then from the viewpoint of T′ those definitions have to be corrected (e.g.
relativised) and so the terms in question have to be re-interpreted.

SRM implies the negation of certain empirical laws assumed in NM to ensure 
the uniqueness of the definitions of ‘mass’, ‘distance’, ‘time interval’, etc. So, for 
example, the sentence ‘A clock which travels with the speed v relative to S loses 
time compared with the clock at rest in S’ implies the negation of the sentence 
‘The speed of the clock relative to S is without influence upon the march of the
clock’; the former sentence is a consequence of SRM and the latter is implicitly 
assumed in NM. 

The empirical meaninglessness of Newtonian concepts from the viewpoint of 
SRM is the result of factual considerations, i.e. of the denial of certain empirical
laws assumed in NM. 

To remove the vagueness of Newtonian concepts one has to take into account 
certain factors disregarded in NM and this will affect also the syntax of the 
language, e.g. Newtonian ‘mass’ is an expression of the form ‘m(x) = y’, whereas 
relativistic ‘mass’ is an expression of the form ‘m(x, v) = y’. 
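
The syntactic difference between ‘m(x) = y’ and ‘m(x, v) = y’ can be sketched as a difference in function signature. This is an illustration only: it assumes the conventional velocity-dependent mass formula m0 / sqrt(1 - v^2/c^2), which the text does not state, and it represents the body x simply by its rest mass; both function names are hypothetical.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def newtonian_mass(rest_mass: float) -> float:
    """NM: 'm(x) = y'. Mass is a function of the body alone."""
    return rest_mass


def relativistic_mass(rest_mass: float, v: float) -> float:
    """SRM: 'm(x, v) = y'. Mass also depends on the speed v relative to S.

    Assumes the conventional formula m0 / sqrt(1 - v^2/c^2).
    """
    if abs(v) >= C:
        raise ValueError("speed must be below c")
    return rest_mass / math.sqrt(1.0 - (v / C) ** 2)


# At v = 0 the two notions agree (this is the 'rest mass');
# at other speeds the one-place expression no longer fixes a unique value.
print(newtonian_mass(2.0), relativistic_mass(2.0, 0.0), relativistic_mass(2.0, 0.6 * C))
```

The extra argument is what "taking into account factors disregarded in NM" amounts to at the level of syntax.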

(3) The examples discussed by Frank fall under a schema and can be general- 
ised using Carnap’s ideas of ‘reduction sentence’, ‘meaning postulate’ and ‘the 
Ramsey sentence of T’. 

If a term ‘t’ is introduced into a theory T with the help of one reduction sentence
of the form:


If O₁(x), then (t(x) iff O₂(x))


then ‘t’ remains uninterpreted (i.e. has no empirical meaning) whenever O₁(x) is
not satisfied. Similarly, if an observational consequence of the reduction sentence
or of a pair of reduction sentences introducing ‘t’ turns out to be false, then again
‘t’ remains empirically uninterpreted (Carnap [1936] and [1952]). Now, definitions
of metric terms such as ‘mass’, ‘distance’, ‘time interval’, etc. are essentially
of the same form as the above reduction sentence, except that the definiendum is
an expression of the form ‘m(x) = y’ or ‘m(x, v) = y’ etc. while the definiens
specifies the measuring operation, the measuring apparatus and its behaviour
(e.g. pointer-readings).
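
How a single reduction sentence leaves a term partly uninterpreted can be shown with a three-valued evaluation. The sketch below is a minimal illustration, not Carnap's own formalism: it reads the first observational predicate as the test condition and the second as the observed outcome, and the function name is hypothetical.

```python
from typing import Optional


def interpret_term(test_condition: bool, outcome: bool) -> Optional[bool]:
    """Value assigned to t(x) by 'If O1(x), then (t(x) iff O2(x))'.

    Returns None when the test condition O1(x) is not satisfied, because the
    reduction sentence is then true whatever truth value t(x) has and so
    fixes nothing about the term.
    """
    if not test_condition:
        return None  # t(x) is left uninterpreted for this x
    return outcome   # t(x) holds exactly when O2(x) holds


print(interpret_term(True, True))    # test performed, outcome positive
print(interpret_term(True, False))   # test performed, outcome negative
print(interpret_term(False, True))   # test not performed: no verdict
```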

In general, if RT is the Ramsey sentence of a theory T, expressing T’s observational
content, then the conditional ‘RT → T’ is the analytic component or meaning
postulate of T, intended to provide the theoretical terms of T with empirical
interpretations (in the extensional sense). If, therefore, RT happens to be false,
the theoretical terms of T remain completely vague, i.e. have no empirical
interpretation. Now, we may have grounds for accepting the negation of RT
either on the basis of direct falsification of some of its consequences or else on the
basis of accepting another theory T′ which logically implies the negation of RT.
The latter would presumably be the case of NM and SRM on Frank’s account, 
with the already mentioned consequence that from the point of view of SRM 


1 Cf. similar claims in Einstein [1916], pp. 24, 27, 30; Grünbaum [1954]; Feynman [1966],
p. 162.
the theoretical terms of NM become empirically uninterpreted (we identify here
Frank’s ‘has no operational meaning’ with ‘is empirically uninterpreted’ or ‘is
completely vague’).
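
The role of the meaning postulate ‘RT → T’ can be illustrated in a toy finite model. Everything in the sketch is an assumption made for illustration: the theory, its single theoretical term, and the two observational predicates (`moves`, `attracts`) are invented, and the extension of the theoretical term is taken to be a subset of a small domain.

```python
from itertools import combinations


def extensions(domain):
    """All candidate extensions (subsets of the domain) for the theoretical term."""
    items = sorted(domain)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def toy_theory(ext, moves, attracts):
    """Hypothetical T: the term applies exactly to the objects that move the
    pointer, and everything it applies to is observed to attract."""
    return ext == moves and ext <= attracts


def ramsey_sentence(domain, moves, attracts):
    """RT: there is some extension of the theoretical term that makes T true."""
    return any(toy_theory(e, moves, attracts) for e in extensions(domain))


def admissible_extensions(domain, moves, attracts):
    """Extensions compatible with the meaning postulate RT -> T."""
    rt = ramsey_sentence(domain, moves, attracts)
    return [e for e in extensions(domain)
            if (not rt) or toy_theory(e, moves, attracts)]


domain = {"a", "b"}
# RT true: the postulate pins the term down to a single extension.
print(admissible_extensions(domain, {"a"}, {"a", "b"}))
# RT false: the postulate holds vacuously, so every extension is admissible.
print(len(admissible_extensions(domain, {"a"}, {"b"})))
```

In the second case the conditional excludes nothing, which is the toy counterpart of the theoretical terms remaining ‘completely vague’ when the Ramsey sentence is false.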

Needless to say, at the time when Frank’s article appeared (in 1938) only the 
older version of Carnap’s reconstruction of empirical theories (Carnap [1936] 
and [1937]) was known in which just two components of theories were distin- 
guished, viz. the uninterpreted calculus (specific to the given theory, e.g. 
Maxwell’s equations of the electromagnetic field, as well as the mathematics and 
logic necessary for making deductions) and ‘semantical rules’ identified by Frank 
with ‘operational definitions’ for metric terms. The third component, correspon- 
dence principles, came to be distinguished by Carnap in his postwar works. 

(4) Frank’s claims that NM and SRM are mutually inconsistent and that some 
conceptual changes in the transition from the one to the other were due to this 
inconsistency are based on an intuitive formulation (one of several possible) of 
both theories and on the intuitive concepts of deducibility and inconsistency. 
Presumably Frank believed that in spite of partial conceptual disparity the 
languages of the two theories do have some sentences in common; sentences 
concerning the behaviour of the measuring instruments (rods, scales, clocks, 
etc.) i.e. pointer readings, were classified by him as common to both theories. It is 
plausible, however, to interpret him as saying that also some of the more 
theoretical sentences, which depend on background theories, are shared by NM 
and SRM. Following Einstein¹ and others, Frank discussed the relation between
NM and SRM from the ‘genetic’ viewpoint, i.e. from the point of view of how
Relativity Theory ‘grew out of’ NM, Electrodynamics and Classical Optics
(Frank [1938], p. 19). This approach, which by the way is within the bounds of
‘diachronic logic’,² may have been one of the reasons why Frank saw the two
theories as logically and empirically comparable rather than ‘incommensurable’.
For on this approach Einstein retained certain components of classical physics, 
viz. the principle of relativity and the principle of constancy of light velocity 
(independence of source velocity); the replacement of Galilean transformation 
by Lorentz transformation to relate space and time measurements in inertial 
frames, accounts for the conceptual changes mentioned before, since the former 
was based on two assumptions rejected in SRM, viz. ‘(1) the time interval (time)
between two events is independent of the condition of motion of the body of 
reference, (2) the space interval (distance) between two points of a rigid body is 
independent of the condition of motion of the body of reference’ (Einstein 
[1916], p. 30) and since the Einsteinian modification of the concept of mass was 
sufficient to make the laws of mechanics covariant with respect to Lorentz 
transformation between inertial frames. Frank’s view of the logical relation 
between NM and SRM was thus in line with the tradition, started by Einstein, 
according to whom the Special Theory of Relativity ‘... has... been developed 
from electrodynamics as an astoundingly simple combination and generalisation 
of the hypotheses, formerly independent of each other, on which electrodynamics 
was built’ (Einstein [1916], p. 41). According to the same tradition, although both 
Special and General Theories of Relativity ‘. . . possessed a decidedly revolution- 
ary appearance when they were announced, it has now become clear that they 
represent the natural termination for the classical theories’ of mechanics and


1 Einstein [1916], chapters 5 and 6. 2 Cf. Suszko [1968].
electromagnetism, ‘rather than a break with these systems of ideas and the
inception of a new line of thought’ (Lawden [1971], p. viii). 

(5) When discussing conceptual changes in the transition from NM to SRM 
Frank did not use any clear-cut conception of meaning. He did approve of the 
basic ideas of Operationalism, in particular of the requirement that a (metric) 
concept in physics, to be empirically meaningful, has to be associated with 
measurement operations which (within limits of experimental error) give 
unambiguous results. However, he did not follow Bridgman in claiming that 
distinct methods of measurement, e.g. of distance, time interval, temperature, 
etc. yield different concepts, e.g. of space, time, temperature, etc. (Bridgman 
[1927], pp. 66-91), which could have formed — under suitable additional assump- 
tions—a premiss for concluding that NM and SRM were conceptually completely 
disparate or ‘incommensurable’. Nor did he appeal for that purpose to the 
syntactic difference between the concepts of NM and SRM, quite rightly, so it 
seems, since one can always relativize Newtonian concepts in an ‘inessential’ 
way by introducing strictly redundant terms. 

Questions of definability (e.g. of Newtonian ‘mass’ in terms of relativistic
‘mass’) and of translation (e.g. from NM to SRM) cannot be reliably answered 
in an intuitive, informal context, i.e. without more rigorous formulation of the 
theories in question and without conventions concerning the analytic compon- 
ents (meaning-postulates) of those theories. It is possible, therefore, that—under 
suitable assumptions—the language of SRM may be shown to be closed, in 
Ajdukiewicz’s sense (Ajdukiewicz [1934]), with respect to the theoretical sub- 
language of NM, i.e. the former cannot be enriched with the concepts of the latter 
without either modifying some of the concepts or making the enriched language
disconnected.

JERZY GIEDYMIN 
University of Sussex 
Review Articles


THE TROUBLE WITH QUANTA¹


The title, ‘Paradigms and Paradoxes’, of the fifth volume of the University of 
Pittsburgh series in the Philosophy of Science is too trendy by half. Since Kuhn’s 
idiosyncratic use of the word ‘paradigm’ in his ‘Structure of Scientific Revolu- 
tions’ the word has lost significance in later writing. The title ‘Paradoxes’, on 
the other hand, is to the point. Basically, five of the six authors are trying to come 
to terms with the famous Einstein-Podolski-Rosen paradox. This is not, of 
course, a formal paradox in the strict sense of being a statement that is true if 
false, and false if true. Rather, it highlights counter-intuitive consequences of 
present-day quantum mechanics. 

The bulk of the book, then, is a discussion of what might be described either 
as the Einstein-Podolski-Rosen paradox, or the theory of measurement, or the 
scandal of quantum mechanics. It is not the transcript of an actual discussion 
between the five authors, but rather a collection of essays based on lectures
delivered at Pittsburgh. Since, however, the authors are well-known members of 
the trade union of philosophers of science concerned with quantum theory, there 
is a good deal of interaction between them, as witnessed by many cross-references 
in the voluminous footnotes. 

Though requiring little knowledge of formal logic or mathematics, the book is 
not easy to read. Moreover, it can hardly be regarded as, and is presumably not 
intended to be, an introduction to the relevant parts of quantum mechanics. To 
appreciate the problems the general reader in philosophy may need some 
introduction. 

Philosophy of science is an essential part of philosophy. The position of the 
philosopher of science is a delicate one: To what extent should he be sensitive 
to the results of actual work in science, and in particular contemporary science? 
We would rightly view with suspicion any philosopher whose philosophy is too 
dependent on stop-press information from scientists. On the other hand, it is the 
case that results of scientific work have proved to be of profound philosophic 
significance. This is true of the three great revolutions in physical science in the 
first third of this century, and of quantum mechanics in particular. Equally 
remarkable is the fact that there has been no progress in fundamental physics 
since then, in spite of the scandalous situation in which quantum mechanics left 
it. Previously, since Galileo, theories were transformed continuously, and pro- 
gress was greatly assisted by the recognition of ‘paradoxes’ in the available theory. 
(I agree with Howard Stein that we may learn from the history of Maxwell’s 
theory of electromagnetism.) 

What is the profound revolution of philosophic significance brought about 
by quantum mechanics? Like the relativity revolution, it depends on certain 


1 Review of R. G. Colodny (ed.) [1972]: Paradigms and Paradoxes: The Philosophical
Challenge of the Quantum Domain. Pittsburgh: University of Pittsburgh Press. $14.95.
Pp. xix + 446.
empirical results. Some commentators, and, presumably unintentionally, even
Heisenberg, an author of the revolution, tend to give the impression that all that
is necessary is to take thought to realise the universal applicability of the un- 
certainty principle to measurement. This impression that all that was needed 
in addition to classical theory was philosophic clarity in order to arrive at the new 
theory, is, of course, false. It is true that there may be a gain in symmetry in the 
new theory and that such ‘aesthetic criteria’ may have been historically influen- 
tial in the heuristic situation, but the decision to adopt these theories is empiri- 
cally based. In the case of quantum theory (arising from a study of optical and 
thermal data), the decisive phenomena included the photo-electric effect and 
electron diffraction. It is, on the one hand, the inadequacy of the classical field 
description in dealing with localised effects best described in terms of atoms of 
light, of photons, and on the other hand, the inadequacy of the classical particle 
description in dealing with interference effects of the ‘electron-wave’, that have 
forced the quantum mechanics revolution on us. ‘Complementarity’ is to be 
regarded as a post hoc philosophic apologia, however much Bohr (a revolutionary 
à outrance) may have inclined that way all along. The quantum mechanics 
revolution, basically, consists in the introduction of incompatible elementary 
parameters into the descriptive repertoire of the theory. This situation, that we 
may with perfect accuracy assign a certain value to one parameter or to another 
parameter of a system but not to both simultaneously, is new, and in some sense 
signals the end of realism. It is for such reasons, finding certain universally 
implied assumptions no longer applicable, that physics, and quantum mechanics 
in particular, offers important lessons to philosophers. 

Of course, no finite number of data (or even an infinite number) actually 
forces us to subscribe to any given theory. It is just that quantum mechanics, 
like the Prime Minister, is the best we have. In spite of great efforts no viable 
alternative has been found. Indeed, there is now experimental evidence refuting 
a large class of ‘hidden variable’ theories such as might be introduced to restore 
determinism to quantum phenomena. 

What, then, is the objection to quantum mechanics? It is refutable, but has 
survived sincere attempts at refutation; it is predictive and has successfully 
predicted a vast range of phenomena; it is in many ways simpler than classical 
mechanics, allowing a far greater role to symmetry; it is consistent (provided we 
admit an unproved general theorem that the phase of the wave function of a 
system is completely destroyed when the system interacts irreversibly with a 
measuring apparatus). Criticisms of quantum mechanics concentrate on the 
charge that it is incomplete. This charge seems to me obviously justified. 
Although quantum mechanics necessarily includes references to individual 
experiments in its account, it cannot predict the outcome of single experiments, 
being ‘merely’ a probabilistic theory. The obvious remedy for this incomplete- 
ness is to supplement the parameters admitted by quantum mechanics with so- 
called ‘hidden variables’ converting the probabilistic quantum mechanics into a 
completely deterministic new theory. There is not, of course, any general reason 
why there should not be hitherto hidden parameters affecting the outcome of
experiments. The trouble is that if we introduce a sufficient set of effective para- 
meters to determine outcomes uniquely, we are committed to consequences 
clashing with empirically confirmed results of quantum mechanics. Recently
‘crucial’ experiments have been performed, constructed precisely on the lines of
the Einstein-Podolski-Rosen paradox,¹ deciding in favour of the counter-
intuitive results of quantum mechanics. 

The present book concentrates on related scandals of long standing. 

In order to account for the seemingly incompatible outcomes of different 
experiments on apparently identical systems (localised detection of photons vs 
interference effects of waves) we make the predictions probabilistic and depend- 
ent on the experimental environment. This implies that on measurement of a 
particular parameter the Schrödinger wave function describing the system, 
which in general shows a spread corresponding to finite probabilities of different 
outcomes of measurement, ‘collapses’ into a state corresponding to the definite 
result of the measurement. This ‘collapse’ of the wave function, corresponding to 
additional information, is quite distinct from motion following the Schrödinger 
equation (the quantum mechanical equation of motion) and has to be accepted by 
those who accept the probabilistic interpretation of the wave function. No 
unification of the two categories of change (Schrödinger motion and collapse) is 
to be looked for. To attempt to reduce the epistemological phenomenon of in- 
creased information to the mechanical interaction with the apparatus would be, 
in my opinion, as ill-conceived as to reduce the ethical concept of free will to the 
existence of a finite irreducible elementary volume h³ in phase space. The so-
called theory of measurement is a curious mixture of epistemological desiderata
and strict consequences of the mathematical formalism: We know what we 
require of a perfect measurement, but it remains to be shown, and has not been 
shown, that these requirements can be met by any explicit interaction with the 
apparatus. Moreover, there is a good deal of hand-waving in introducing the 
essentially ‘classical’ behaviour of the apparatus. The theory of measurement 
leads to a number of consequences of which the Einstein-Podolski-Rosen 
paradox is perhaps the worst. The various aspects of the latter are discussed in 
detail by Hooker. Perhaps the most counter-intuitive aspect is the (statistical) 
correlation of measurements carried out under circumstances where quantum 
mechanics denies any interaction. ‘This suggests the following incredible picture:
the state of a physically isolated system . . . can be affected by choice of measure- 
ment on another physical system having no physical interaction with it.’ Note 
that this effect hides shyly in statistics (like that other ‘psi-phenomenon’, psycho- 
kinesis). Indeed, the effect only shows up as correlation. The ‘paradox’ only 
arises in the theory, which makes the result of the measurement on the second 
system differ according to our choice of first measurement. Since we cannot make 
more than one choice, no contradiction results. In classical theory the results of 
the second measurement do not depend on the choice of first measurement, and 
the correlations are, indeed, numerically different. 

Van Fraassen, Fine and Finkelstein offer their various balms for the wound of 
quantum mechanics, but no cures. To label the interpretation of quantum 
mechanics ‘modal’ is merely to embalm the existing problem of linking prob- 
ability to actuality. To define a logic to match quantum mechanics seems to 
side-step what problems there may be: one problem might have been to fit 
quantum mechanics into independently given ordinary logic. In discussing the 
two-hole experiment Fine points out that we need not define the probability of 
compound events to obtain a consistent probabilistic theory. But again, this is


1 See, for instance, Shimony [1971]. 
defending quantum mechanics against the wrong charge. The charge is not one
of inconsistency, but of leading to counter-intuitive consequences. Finkelstein, 
in a crisp essay, offers a neat lattice representation of some aspects of quantum 
mechanics. The unwary reader might be tempted to forget at this stage that quan- 
tum mechanics is a physical theory that crucially depends on certain empirical 
discoveries. Finkelstein’s exercise seems to me a formulation at the level of theo- 
retical physics (and its blind sister, applied mathematics), but not the creation of 
a new logic as he claims. I do not suppose that he seriously proposes to replace 
ordinary logic at all levels. However, I agree with Finkelstein, as against Van 
Fraassen, that if one wants to formulate a logic to match quantum mechanics (or 
indeed any physics) it is ‘operationally superior’ to start with qualities rather than 
things. This is analogous to the development of abstract atomism. 

It is a pity that Hooker did not find the time to write a shorter article. Cer- 
tainly he deserves the thanks of all concerned for his exhaustive bibliography. 
There are many seminal points in the first part of his essay, but the main message 
lies in the second part, in his analysis of Bohr’s epistemological holism. He is 
making an interesting point when adducing classical examples of complemen- 
tarity in wave optics (polarisation). But since in classical physics the values of the 
components have direct physical significance as the results of measurement, there 
is no collapse of the wave function, and no paradoxes arise. (Incidentally, the 
large range of quantum optics covered by classical wave optics merits analysis.¹
It may turn out that Schrödinger’s preference for a wave model is justified after
all.) In general the famous philosophy of complementarity seems to me merely 
a restatement of the problem. On the whole I sympathise more with Howard 
Stein, who feels that the problems posed by quantum mechanics should be 
faced. 

Stein is very properly impressed by the problem of interpretation of a physical 
theory, a problem that is central to philosophy of science. Since almost all 
problems can be swept under the carpet of interpretation (e.g. the indeterminism 
of quantum mechanics is camouflaged in the deterministic Schrödinger ‘wave’ 
equation) it is difficult to assess the value of ‘axiomatisation’ in the remaining
formalism. Admittedly such axiomatisation enables one to make metatheoretical 
remarks, including remarks concerning consistency, independence and symmetry, 
and may bring out hidden assumptions. But the heart of the problem is usually 
left under the heading of interpretation. Here again, the present reviewer has 
difficulties: What is the status of an exercise, such as Van Fraassen’s, of ‘re- 
interpreting’ a certain class of objects in the formalism, viz. the so-called ‘mixed 
states’? These were introduced within the orthodox theory, including a prob-
abilistic interpretation, for the purpose of expressing our ignorance of certain
aspects of a system. There does not seem to be any leeway in the interpretation. 
Possible differences in general interpretation of probability itself do not seem 
relevant, since even an objectivist allows changes in probability distribution in 
the light of further information. 

Stein seems ambivalent on the question of correspondence, i.e., the question
how far classical physics can be regarded as a ‘limiting case’ of quantum 
mechanics. (In the view of the present reviewer, it cannot.) In any case, I regard 


1 See, for instance, Hanbury Brown and Twiss [1957], Sillitto [1960], Sillitto and Haig
[1968] and Sillitto [1971].
it as a flaw that the classical point-function expression for a law such as Coulomb’s
should have a place in quantum mechanics. 

Incidentally, I advocate that ‘Mach-baiting’ by philosophers of science should 
now cease. On the dubious grounds that Mach and Ostwald were ‘wrong’ in 
their opposition to the contemporary atomists (led by Boltzmann), the moral is 
drawn that one should behave as differently as possible from these ‘reactionaries’. 
A comparison of the inaugural lectures of Mach and Boltzmann should convince 
the reader that Mach at any rate was a philosopher, and that Boltzmann was not. 

This is not to assert the primacy of philosophy over science: far from it. I find 
several very interesting passages in this book to support my view that the inno- 
cent student of quantum mechanics is sold a package including neo-Kantianism, 
whether he likes (or even notices) it or not. I see it as a task for philosophers of 
science to unwrap this unwelcome gift parcel and try to separate neo-Kantian 
prejudices from scientific results. This seems to me a more worthwhile exercise 
than the attempts in this book to produce a premature unity of philosophy and 
science. ‘Feindschaft sei zwischen Euch, noch kommt das Bündniss zu frühe;
wenn ihr im Suchen euch trennt, wird erst die Wahrheit erkannt.’ 

Past history of science discredits the argument from predictive success of a 
theory to correctness of the fundamental ontology. It is precisely the top levels 
of a theory that are changed by revolution. Progress out of the stalemate in 
physics may be on the lines called for by Feinberg, by insisting on the con- 
struction of intuitively acceptable models. It is arguable (though this is not put 
by Feinberg) that instead of forcing elementary particle theory into the frame- 
work of quantum mechanics, elementary particle theory might lead directly to a 
new, better theory replacing quantum mechanics. 

Only a non-linear global deterministic theory may include in principle the 
actions of the experimenter. As long as we retain linear probabilistic theory the 
intervention of the experimenter cannot be included. Modern physics has trod 
again the path of atomism from Democritean determinism to Epicurean swerves. 
It also suffers from the ‘independence’ difficulty of any linear theory. It is more 
a reflection on the state of physical science for the last forty years, than on 
philosophy of science, that the reader’s conclusion (in the style of this book
which abounds in quotations) is 

‘Myself when young did eagerly frequent 
Doctor and Saint and heard great Argument 
About it and about: but evermore 

Came out by the same Door as in I went.’ 


HEINZ POST 
University of London 
1 INTRODUCTION


Several years ago members of the United States National Committee for the 
International Union of History and Philosophy of Science became officially 
concerned with the rationale for the union of History of Science and Philosophy 
of Science. This concern was, not surprisingly, channelled into the organisation 
of a conference (held at the University of Minnesota in the fall of 1969) and into 
the publication of papers. The original idea was to have pairs of historians and 
philosophers address limited and well-defined topics in science. These studies 
were to pave the way for higher level discussions of possible relations between 
historical and philosophical approaches to an understanding of the scientific 
enterprise. In actuality, only remnants of the original plan survived. What we 
have instead is a collection of thirteen papers, related to be sure, but hardly 
focused on a few well-defined topics. There are five papers (Feigl, Feyerabend, 
Hesse, McMullin, Thackray) dealing explicitly with the nature of history or 
philosophy of science and their mutual relations. Four of the papers (Hiebert, 
Rosen, Stein, Stuewer) are nearly pure history of science. Similarly, one paper 
(Salmon) is a purely philosophical study. Finally there are three papers (Achin- 
stein, Buchdahl, Schaffner) which argue methodological theses using historical 
case studies for illustration and support. Six of the papers (Achinstein, Buchdahl, 
Hesse, Schaffner, Stein, Thackray) are followed by comments and replies, some 
of which are quite substantial.²

I will begin by commenting on the historical and philosophical contributions 
and then take up the papers focusing on history or philosophy of science and their 
mutual relationship. After summing up my reactions to the volume as a whole I 
1 Review of R. H. Stuewer (ed.) [1970]: Historical and Philosophical Perspectives of Science,
Minnesota Studies in the Philosophy of Science, 5. Minneapolis: University of Minnesota
Press, $11.50. Pp. 384.
2 One might question whether these papers exhibit sufficient unity of content or approach
to justify inclusion in a single volume. But as this volume has greater unity than some
other recently published collections in the philosophy of science, it seems unfair to
pursue this question here. In the end, it seems such matters must be left to editors,
referees and publishers.




will sketch an alternative analysis of the philosophy of science and its relations with
history of science. My comments throughout will reflect an interest in this latter 
issue. Having spent my professional life thus far in a department of history and 
philosophy of science, I have often had occasion to wonder if the union is not 
primarily a marriage of convenience. It may be better than living with one’s 
parents, history and philosophy respectively, or with one’s rich relatives, the 
sciences. But does it have the passionate involvement and deep communication 
that one was led to expect? The overall situation is, I think, less reassuring than 
it might seem to many readers of this volume. 


2 NOTES ON HISTORICAL AND PHILOSOPHICAL CONTRIBUTIONS 


Turning first to primarily historical papers, Erwin Hiebert’s ‘Mach’s Philo- 
sophical Use of the History of Science’ attempts to trace the relationships among 
Mach’s works as a physicist, historian of science, and philosopher of science. 
Hiebert argues that Mach was primarily a physicist whose interest in the history 
of science was at first pedagogical and only gradually came to be viewed as essen- 
tial for an adequate understanding of scientific concepts. Mach’s methodological 
reflections, Hiebert claims, ‘were the out-growth of the scientific research in 
which he was engaged most of the time’ (p. 192), and were thought by Mach to be 
justified by history (p. 197). Clearly Mach’s thought provides a case study of the 
relation between history and philosophy of science. It is not clear from Hiebert’s 
paper, however, that Mach’s philosophical views, e.g. on the epistemological role 
of sensations or the economy of thought, have any ‘logical’ or ‘conceptual’ 
relations to either his physical or historical investigations, and if so, what pre- 
cisely these relations might be. 

Edward Rosen’s ‘Was Copernicus a Hermeticist?’ follows the paper by Mary 
Hesse in which the influence of the Hermetic tradition figures as the primary 
example in a historiographic debate. Rosen directs a battery of quotations at 
Frances Yates and concludes, among other things, that ‘the hermetic association
amounts to about 0.00002 of [Copernicus’s] Revolutions’ (p. 169). Whether he 
carries the day in this particular dispute I leave for others to judge. 

Howard Stein’s ‘On the Notion of Field in Newton, Maxwell and Beyond’, is 
part of a projected longer paper on the same topic. The section on Newton is 
substantial; the section on Maxwell and beyond is rather more sketchy. Stein’s 
thesis regarding Newton is that the concept of a field, though of course not by 
this name, is essential to Newton’s work at three levels: heuristic, theoretical, and 
metaphysical. Stein’s commentators, Mary Hesse and Gerd Buchdahl, raise 
several substantial and detailed objections which elicit some sharp replies. I am 
not competent to judge the issues concerning Newton. It is interesting to note, 
however, that Stein’s paper provides an example of that mode of approaching 
historical questions which by and large presupposes current philosophical 
categories. Thus Stein brings to his reading of Newton the distinctions among 
‘heuristics’, ‘fundamental theory’ and ‘metaphysics’. This is not to say that the 
philosophical categories are rigidly held or rigidly applied, but they clearly are 
tools in the historical investigation and not products of that investigation.¹
1 Yet Stein’s introduction contains the provocative parenthesis: ‘Data for the philosophy
of science can come only from the history of science’ (p. 265). Presumably the categories
he uses here are based on other historical cases.
Roger Stuewer’s ‘Non-Einsteinian Interpretations of the Photoelectric Effect’
is a straightforward attempt to explain why it took from 1905 until the publica-
tion of Compton’s work in 1923 for Einstein’s light quantum hypothesis to gain
general acceptance by the physical community. The answer has several parts. 
First, there was naturally great reluctance, after Hertz, to believe that Maxwell’s 
theory was inadequate. Second, the experimental evidence was inconclusive 
regarding Einstein’s prediction of a linear relation between the frequency of 
incident radiation and the maximum photoelectron energy. Finally, there were a 
number of ‘classical’ alternatives that also accounted for what data there was. 
Stuewer discusses theories developed by Lorentz, Thomson, Sommerfeld, and
Richardson. The first three were abandoned in 1914 with the acceptance of the
Bohr model of the atom. Richardson’s wholly macroscopic account survived
until Compton. If there are any global historical or philosophical theses to be 
drawn from this episode in modern physics, Stuewer does not draw them. But it 
is a fascinating story nonetheless. 

Turning to the primarily philosophical papers, in ‘Inference to Scientific 
Laws’, Peter Achinstein suggests two forms of inference to scientific laws, 
explanatory and inductive. He also gives a pragmatic analysis of the distinction 
between the contexts of discovery and justification. If a scientist first becomes 
acquainted with a hypothesis while reasoning to it, that reasoning, for him, is in a 
context of discovery. Similarly for justification. It turns out that both explanatory 
and inductive inference may occur in either the context of discovery, or justifica-
tion, or neither! Achinstein goes on to apply these distinctions to the classic
cases of Gay-Lussac and Avogadro. The former is said to have reasoned in-
ductively in both contexts; the latter explanatorily in the context of justification
only. Now Achinstein claims that ‘doing these things should provide a better 
understanding of the origin of the law’ (p. 104). He also thinks that the formula- 
tion of philosophical distinctions and their application to historical cases are 
mutually interacting processes. In this essay, however, the ‘interaction’ is all one 
way. Moreover, Achinstein does not indicate the nature of the support which 
historical cases might lend to philosophical distinctions. 

That Achinstein’s analysis of the contexts of discovery and justification is not 
the usual one is clear from Wesley Salmon’s ‘Bayes’s Theorem and the History 
of Science’. For Salmon, a fact is in the context of justification or discovery 
relative to a given hypothesis according as that fact is evidentially relevant to the 
hypothesis or only psychologically relevant. Now evidential relevance is deter- 
mined by an inductive logic. Salmon argues that the inductive logic for theories 
is supplied by Bayes’s Theorem—and laments the fact that most historians seem 
to hold some version of the inadequate H-D view. He clearly thinks it important 
for historians that they recognize the inadequacies of the H-D view. Finally, 
Salmon provides the history of science with an unexpected role. Application of 
Bayes’s Theorem requires prior probabilities for theories. If these are to be based 
on experience, Salmon argues, they must come from our past experience with 
theories of various kinds, and this experience is recorded by historians. Thus the 
history of science provides a necessary ingredient in the logical evaluation (justi-
fication!) of current scientific theories. It should come as no surprise that most
of this would be disputed by other inductive logicians. The fundamental diffi-
culty, I think, is this: For Salmon, the most historians can provide for prior
probabilities is a rough estimate of the relative frequency of successful theories of 
a certain type. The posterior probability he wants from Bayes’s Theorem, how-
ever, is the relative frequency of true theories of that type. But past success is not 
the same as truth. Thus unless Salmon is willing to settle for a posterior pro- 
bability of success, and even this needs clarification, he must find some other 
account of the prior probabilities of theories. 

Kenneth Schaffner’s contribution is a tour de force entitled ‘Outlines of a Logic 
of Comparative Theory Evaluation with Special Attention to Pre- and Post-
Relativistic Electrodynamics’. The aim of this paper, which runs forty pages, is to
reconcile ‘historical’ accounts, in which rational comparison of incommensurable 
theories is impossible, and ‘logical’ accounts, in which experimental data provide 
sufficient assessments either through falsification or differential confirmation. 
Schaffner first sketches modifications of standard analyses of the meaning of 
theoretical terms, correspondence rules and relations between theory and observa- 
tion. He then introduces three criteria of evaluation adopted from the introduc- 
tion to Hertz’s Principles of Mechanics: (A) Theoretical context sufficiency, i.e. 
‘concordance between a theory to be assessed and the corpus of accepted scien- 
tific knowledge’ (p. 321); (B) Experimental adequacy; (C) Simplicity. While 
admitting that the three criteria do not ‘constitute a type of easily applicable 
schema’, Schaffner insists that they ‘do accurately characterize the process of 
comparative theory evaluation as it is practiced by scientists making the history 
of science’ (p. 330). The second half of the paper is an attempt to support the 
latter claim by comparing Lorentz’ electrodynamics with Einstein’s theory of 
relativity in 1905. After sketching each of the theories, he considers them in the 
light of each criterion and comes up with a ‘split decision’: ‘theoretical context 
sufficiency supports Lorentz’s theory, relative simplicity supports Einstein’s, 
and experimental adequacy . . . selects neither theory . . .’ (p. 347). 

In his exceptionally clear comments, with which I largely concur, Arnold 
Koslow objects (i) that Schaffner’s criteria presuppose and do not resolve the
‘paradoxes’ of the historical school, (ii) that Schaffner is wrong about the ontological
status of the aether, and (iii) that the conclusion of a ‘split decision’
unjustifiably assumes weighing the criteria equally. Schaffner, of course, dis- 
agrees. The main difficulty, for me, is understanding the status of the three 
criteria. Are they primarily descriptive of criteria scientists in fact appeal to in 
comparing theories, or do they carry normative force? In the former case one
would expect much effort to show that most scientists do use them. In the latter 
case one would expect some attempt to say why these criteria are appropriate, 
e.g. because they further the aims of theoretical inquiry. In fact we are offered 
only some of the former and none of the latter. The trouble is that without some 
claim of normative force, Schaffner’s views go little beyond those of the his- 
torical school—and the claim to a ‘logic’ of comparative theory evaluation is 
spurious. No one denies that scientists do in fact use various criteria to compare 
theories. The question is whether these are rational evaluations or merely means 
employed to persuade others to change their allegiances. In short, it is unclear 
whether Schaffner is doing logic or sociology, and until this is clear it will seem 
that he has reconciled the logical and historical schools only by ignoring their 
real philosophical differences. 

Gerd Buchdahl’s ‘History of Science and Criteria of Choice’ is clearly aiming
at results similar to Schaffner’s paper. Like Schaffner, Buchdahl employs a triad 
of criteria for choosing one theory over another, but here the criteria are organised 
in a hierarchical structure. At the top, so to speak, is the architectonic component
consisting of regulative ideals and preferred explanation types. Next is the level 
of explication which requires that theoretical concepts be intelligible relative to 
other relevant concepts. Third is the constitutive component of choice which 
includes the explicit systematic formulation of a theory as well as its confirmation 
by facts. Buchdahl defends his methodological structure by arguing that it pro- 
vides a way of understanding Newton’s perplexities about the nature and cause 
of gravity. Action at a distance seems demanded at the constitutive level by the 
empirical success of the theory, but it is hard to reconcile with pre-existing 
concepts of space and matter. Newton resolves this tension, Buchdahl argues, by 
appealing to the architectonic idea of final causes. Laudan disagrees with some details
of this analysis, but is in full agreement with the general enterprise.

Unlike Schaffner, Buchdahl explicitly defends his criteria as being principles 
of rational choice and thus explicitly rejects the traditional empiricist position
that the fundamental basis for choice of theories is ‘agreement with data’. I will 
take up this central issue below, after a brief survey of those papers which focus 
directly on the nature of history and philosophy of science and their mutual 
relations. 


3 CONTRIBUTIONS ON HISTORY AND PHILOSOPHY OF SCIENCE AND 
THEIR MUTUAL RELATIONS 


Herbert Feigl’s introductory remarks at the Minnesota conference, ‘Beyond 
Peaceful Coexistence’, serve as the opening essay of the volume. Here Feigl 
confesses and repents of the worst sins of recent empiricist philosophers against 
the history of science; namely, citing examples, e.g. of Newton or Einstein, 
without any regard for the actual facts of the case and, moreover, without even 
making a serious attempt to find out the facts. He is, however, far from renounc- 
ing the chief doctrine of logical empiricism concerning history, that is, that there 
is a fundamental distinction between the ‘historico-sociological’ development of 
science (‘discovery’) and the ‘logico-methodological’ reconstructions of scientific 
claims (‘justification’). Indeed, he thinks this distinction essential ‘if we are to 
retain even a minimum of clear thinking in these badly confused matters’ (p. 4). 
Thus Feigl explicitly rejects the ‘anarchistic’ position of Feyerabend. Moreover, 
he maintains that ‘the good historian of science must devote a great deal of 
attention to the meaning and justification of scientific knowledge claims’ (p. 4). 
On the other hand, Feigl insists on the Lakatosian paraphrase of the Kantian 
dictum for history and philosophy of science: History of science without phil- 
osophy of science would be ‘blind’, while philosophy of science without regard to 
history (i.e. analysis of specific cases in their cultural setting) would be ‘empty’ 
(p. 4). Now let us grant that philosophy of science without science would be 
empty. The question for one holding the ‘Kantian’ dictum is whether and how 
the historian of science, as historian, has anything essential to contribute to the
content of contemporary philosophy of science. So far as I can see, Feigl’s
comments fail to answer this question.

One of the more provocative contributions is Arnold Thackray’s ‘Science:
Has Its Present Past a Future?’ Here Thackray laments the present lack of professional
interest in historical issues relevant to contemporary social problems,
most of which are, after all, connected with the post-war growth of science and
technology. He even attempts to trace this lack of interest to the social conditions
of the immediate post-war period during which history of science emerged as a 
profession.1 While not denying the value of the history of scientific ideas,
Thackray strongly objects to its current dominance of the field and urges greater 
emphasis on such questions as the effects of secrecy and military sponsorship or of 
changes in the morale and status of scientists. In his equally spirited reply, 
Laurens Laudan argues that general historiographical debate is sterile, that 
history of science governed by considerations of relevance to current social pro- 
blems would be bad history, and that research into the history of scientific ideas 
is justifiable even if it has no relevance to contemporary social issues. Thackray’s
response is that the interest and importance of an historical problem depends on a 
larger context. Within the framework of current internalist history, the suggested 
topics may seem unimportant or uninteresting; but these same questions may 
take on an entirely different light viewed in a different framework, and this is 
really what Thackray is urging.2

Paul Feyerabend did not attend the Minnesota conference, but did submit a 
paper, ‘Philosophy of Science: A Subject with a Great Past’. By Feyerabendian 
standards this is a modest effort (only eleven pages!), the bulk of which is devoted 
to a brief examination of Mach’s philosophy of science.3 According to Feyerabend,
philosophy of science has been in a decline since Mach, a decline that began in 
Vienna in the 1920s. The reason for the decline, in brief, was the abandonment 
of Mach’s critical attitude toward both science and philosophy. The result is that 
people calling themselves philosophers of science devote much time and logical 
ingenuity to black ravens and grue emeralds, but little to real science. The cure 
for this dismal state of affairs is ‘to replace the beautiful but useless formal 
castles in the air by a detailed study of primary sources in the history of science’ 
(p. 183). Fortunately for all of us, the choice is not between Israel Scheffler and 
Imre Lakatos. There is plenty of room for maneuvering between these two 
extremes. More on this below. Here I will just note that like Feigl and others, 
Feyerabend fails to argue that it is history of science, as history, that is necessary 
to rejuvenate philosophy of science, and not simply closer attention to real live 
science. 

Mary Hesse’s ‘Hermeticism and Historiography: An Apology for the Internal 
History of Science’, is just that. Hesse argues that there is no way for the his- 
torian to distinguish internal from external factors, e.g. ideas about force from 
ideas about love or monarchy, unless he has a prior theory of rationality to deter- 
mine which ideas are rationally related and which are not. But, the argument 
continues, there is no theory of scientific rationality which is both strong enough 


1 The thesis, oversimplified, is that social history of science was too tainted by its association
with Marxism to be respectable in the 1950s. Koyré’s intellectualist approach
provided a much more comfortable paradigm, especially for those whose backgrounds 
lay in the sciences rather than in history. 

2 For a similar discussion of the relevance of philosophy of science to contemporary social
issues see my [1971]. 

3 Incidentally, I found Feyerabend’s brief remarks about Mach more enticing than 
Hiebert’s sober and systematic treatment—that is, Feyerabend made me want to take 
another look at Mach. But perhaps this is only because my own interests are more 
philosophical than historical.

4 Feyerabend cites Israel Scheffler’s Anatomy of Inquiry as a paradigm case of a useless
enterprise. The reference to Lakatos, though not stated, is presumably Lakatos [1971]. 
to make the necessary distinctions and also acceptable to the majority of philosophers
or historians. She therefore concludes that no general defense of an
autonomous internal history of science is possible. Nevertheless, she goes on to 
insist that particular historical judgments of irrelevance can be made, e.g. 
regarding the influence of the hermetic tradition on seventeenth-century science. 

Thackray, who represented the externalist position at the conference, would 
turn Hesse’s own arguments against her. Putting internalist history on a solid 
foundation, he insists, does require sound general philosophical principles to 
demarcate internal from external factors. Thus Hesse’s arguments against the 
existence of such principles are taken as arguments against the autonomy of 
internal history. This leaves the social historian of science much greater latitude. 
In particular, he is free to forsake philosophical analysis for modern empirical
historical-sociological techniques. Now there are many philosophers of science, 
including several who attended the Minnesota conference (e.g. Adolf Grünbaum,
Imre Lakatos, Wesley Salmon) who would agree with Thackray about the neces- 
sity for general philosophical principles of demarcation and disagree with Hesse’s 
view that there can be no such principles. These philosophers would thus offer a 
stronger defence of an autonomous internalist tradition than Hesse thinks pos- 
sible. Except for a few remarks by Feigl and Salmon, both of whom go out of 
their way to insist on the mutual relevance of history and philosophy of science, 
this viewpoint is missing from the volume. This is unfortunate because the latter 
is still the majority view among philosophers of science. 

By far the most systematic investigation of current relations between history 
and philosophy of science, in this volume or anywhere, is Ernan McMullin’s 
‘The History and Philosophy of Science: A Taxonomy’. McMullin begins by 
distinguishing two senses of ‘science’, one including the other: (S₁) ‘a collection
of propositions, ranging from reports of observations to the most abstract theories
accounting for these observations’ (p. 15); (S₂) ‘everything the scientist does that
affects the scientific outcome in any way’ (p. 16). There are accordingly two types
of history of science: HS₁ is basically a chronicle of theories and experiments;
HS₂ is an attempt to explain how a particular piece of science came to be.2
Philosophy of science, however, is not dichotomised along quite the same line. 
McMullin calls one type ‘external’ (PSE) ‘because its warrant is not drawn from
an inspection of procedures actually followed by scientists’ (p. 24). ‘Internal’ 
philosophy of science (PSI), on the other hand, ‘relies for its warrant upon a 
careful “internal” description of how scientists actually proceed, or have in the 
past proceeded’ (p. 26). PSE is in turn characterised as either ‘metaphysical’ 
(PSM) or ‘logical’ (PSL) depending on the source of the external warrant. 
Plato and Descartes provide an example of PSM; Carnap of PSL. McMullin 


1 It should be mentioned that Imre Lakatos did read a version of his [1971] which offers an
account of scientific rationality and explicitly applies this account to the problem of 
demarcating internal from external history of science. Unfortunately Lakatos withdrew 
his paper from the volume so that it might appear together with a reply by Thomas Kuhn 
in the proceedings of the 1970 Philosophy of Science Association meeting. (Cohen and 
Buck (eds.) [1977].)

2 McMullin cites E. T. Whittaker’s History of Theories of Aether and Electricity and L.
Pearce Williams’s Michael Faraday as paradigms of HS₁ and HS₂ respectively.

3 Having proceeded thus far in his taxonomy, McMullin pauses to consider several recent
philosophical works that seem to rely essentially on history of science, particularly
Lakatos [1970] and Feyerabend [1969]. McMullin concludes that in Lakatos the role


also classifies philosophy of science by subject matter: (i) epistemology, (ii)
ontology, and (iii) philosophy of nature. Since in principle one might pursue any
problem internally or externally (and this in two ways), there are nine possible 
pure modes of philosophy of science, plus all possible mixtures. The main points 
of entry for history of science, according to McMullin, are via internal epistem- 
ology and ontology. Both points are worth considering. 

History of science is crucial for deciding the ontology of a theory, according to 
McMullin, because only the temporal development of a theory reveals its onto- 
logical commitments. Thus, for example, the Bohr model of the atom suggested 
both new phenomena, e.g. the Stark effect, and novel explanations for old data, 
e.g. the Pickering series of spectral lines. The ability of the model to anticipate 
effects not considered during its inception can only be explained, McMullin 
claims, by assuming that there is some ‘resonance’ between the model and real 
existing objects. Now while I am quite sympathetic with this line of argument for 
the reality of certain theoretical entities, I question whether it provides support 
for the relevance of history to the philosophy of science. First, it may be ques- 
tioned whether the temporal dimension, as such, is playing a crucial role in the 
argument. It is at least possible that the reason successful predictions (in the 
temporal and not logical sense) support the ontological claim is simply that valid 
testing of a theory requires that the theory not be designed to fit the data. Ex- 
plicitly predicted results insure that this requirement is met. This, anyway, was 
Peirce’s view of theory testing and it may be supported by some modern views of 
hypothesis testing as well.1 So the issue may not be one of temporal development 
as such, but of confirmation (though not in the sense of logical probabilities). 

Secondly, even granting that the temporal development of a theory reveals its 
ontological commitments, it does not follow that history of science, as history, is 
crucial, except in cases where the theory in question is one held in the past. 
Suppose, for example, that properly to assess the evidence in 1953 for the exist- 
ence and character of DNA one had to look at the development of that theory 
from 1945 to 1953. This would not require the special talents of a historian of 
science. To argue that any consideration of temporal development brings in 
history would commit one to arguing that dynamics is a historical science. 
Moreover, to argue, as McMullin appears to, that temporal development is not 
subject to logical and mathematical analysis would remove dynamics from 
physics. Surely this is giving the historian of science more than he seeks. 

The above point raises the general question concerning the essential role of 
history of science, as history, even for internal philosophy of science. McMullin 


assigned to history of science is equivocal, being ‘at once emphasized and called upon as 
evidence, yet systematically “reconstructed” in the service of a prior theory of ration- 
ality’ (p. 34). An answer to some of McMullin’s objections may be found in Lakatos 
[1971], the paper Lakatos read at the conference but did not publish in the volume. 
Feyerabend is found to be using history only to illustrate a prior notion of rationality 
which is not really based on history at all, despite appearances to the contrary. 

1 I have touched briefly on the role of a temporal gap between hypothesis formation and
data gathering for statistical hypotheses in Giere [1969]. The conclusion is that a temporal
gap is sufficient but not necessary to insure satisfaction of a necessary condition for
valid statistical inferences, at least on one common account of statistical inference. Thus
I would agree with McMullin in opposing standard philosophical accounts of con- 
firmation according to which temporal relations are necessarily irrelevant. 
himself considers the possibility that PSI might appeal only to contemporary
scientific practice and not necessarily to the history of science. He rejects this 
possibility because: (1) history ‘provides complete case studies of a kind one 
could not recover from contemporary science’; and (2) history ‘allows one to 
study science in its all-important temporal dimension’ (p. 29). But surely we 
know (or can learn) more about the discovery of DNA than of bacteria, and 
surely the study of recent developments in science requires no peculiarly his- 
torical techniques—or at least not the techniques now taught by most historians 
of science. 

Turning finally to epistemology and the history of science, the crucial question, 
in McMullin’s own words, is ‘Can a PSI be normative?’ (p. 42). His answer 
seems to be that it cannot. If one grants that epistemology is normative, it follows 
that one cannot get an epistemology out of the history of science—unless one 
provides a philosophical account which explains how norms are based on facts. 
This ought to be a central problem for historically oriented philosophers of 
science, but few seem willing even to acknowledge the question, let alone attack 
it head on. 


4 SUMMING UP


Except for a few contrary indications voiced by Feigl, Salmon, Stein and Koslow, 
the overall picture of recent philosophy of science that emerges from this volume 
is one of a discipline dominated by a concern with formal systems having little 
relevance to actual scientific theories or practices. History of science is seen as the 
means, or at least a means, of remedying the situation. 

Now it is certainly true that the growing dissatisfaction with the logical empiri- 
cist approach during the late 1950s and early 1960s coincided with criticism which 
appealed to the history of science. Works like N. R. Hanson’s Patterns of 
Discovery, Stephen Toulmin’s Foresight and Understanding and Thomas Kuhn’s 
The Structure of Scientific Revolutions come readily to mind. It is not obvious, 
however, that this criticism was effective (I do not say valid) because of its appeal to 
historical cases. Its effectiveness might be explained by the fact that it appealed to 
real science. That it was the science of Kepler and Darwin rather than that of
R. P. Feynman and J. D. Watson may have been incidental. 

Contrary to the spirit of some remarks in this volume, e.g. by McMullin and 
Buchdahl, no logical empiricist ever thought that the form, content, or methods
of science may be derived from formal logic alone. For traditional logical empiri- 
cists, the task of the philosopher was always the rational reconstruction and 
explication of theories, methods and meta-concepts found in actual scientific 
practice. Where logical empiricism failed was in the application and justification 
of this approach. The classic examples of reconstruction and explication reveal 
much about the preferred logical tools and empiricist epistemological pro- 
grammes of logical empiricism, but little about actual science. For example, 
traditional discussions of explanation and the interpretation of theories, usually 
carried on in the context of first order languages, have little relevance even to 
current philosophical discussions of the existence and nature of conventions in 


1 Indeed, one of Feyerabend’s complaints in this volume is that logical empiricists have
been too slavish in following recent developments in science and thus have not played a
sufficiently critical role.




physical theory, e.g. in special relativity. Similarly, the vast literature on the 
justification of induction and the paradoxes of confirmation is hardly relevant to 
fundamental problems concerning hypothesis testing in psychology, let alone to 
the confirmation of hidden variable theories of micro-processes. Moreover, the 
theoretical account of the link between the facts of scientific practice and the 
normative conclusions of philosophical analysis was never very well developed. 
The requirement that the explication ‘resemble’ the explicandum was seldom 
examined and too often ignored. The distinction between scientific fact and phil- 
osophical convention, like that between synthetic and analytic statements, was 
applied uncritically. Even Carnap’s distinction between internal and external 
questions—the latter resolved by pragmatic decisions—left open as many 
problems as it answered. 

On the above account of the failing of logical empiricism, it does not follow 
that history of science, as such, promises any remedies. The most direct response
would be to develop more flexible logical and mathematical tools, to pay closer 
attention to actual scientific theories, and to worry more about the nature of 
philosophical conclusions about science. At the moment there are many phil- 
osophers of science, particularly among the younger generation, who are follow- 
ing just this course. Moreover, this approach seems to be yielding dividends, 
especially in the philosophy of physics and in inductive logic. It is regrettable 
that many of the contributors to this volume, apparently unaware of recent 
trends among their more formalistically inclined colleagues, write as if the 
only alternative to historical studies is the mould set by Carnap in the nineteen-
thirties and forties.

Although there are substantial differences among them, those philosophers of 
science who make serious use of the history of science form a loosely connected 
school within the philosophy of science. It is natural that members of such a 
school should see their discipline in a different light from others. One would 
hope, however, that the members of the school will not be content merely to practice
their art but will make repeated efforts to explain and argue the rationale of their 
approach. I have already indicated a number of questions for the historical 
approach which seem to me to need further discussion. Let me emphasise one 
key set of questions by taking a further brief look at an issue that accidentally 
emerged as a central topic of the volume, the choice of theories. 

McMullin describes Kuhn’s conclusion ‘that changes of paradigm cannot be 
justified on empirical or rational grounds’ (p. 62) as a prime example of the impact 
of historical studies on logical and epistemological doctrines in the philosophy of 
science. Yet at key points in his argument for the non-rationality of paradigm 
choice, Kuhn appeals to the fact that there can be ‘no proof’ that one view is the 
true one. This is not a conclusion based on history but a logical point, one for 
which Hume is justly famous. The claim is, at bottom, that there can be no non- 
deductive reasoning in favour of a general theoretical framework. Would anyone 
argue that this claim follows from historical case studies? Moreover, when Kuhn 
claims that ‘mass’ means something different in classical and relativistic 
mechanics, thus rendering these theories incommensurable, he is not appealing
to history but to an analytical criterion of difference in meaning. What historical 


1 See, for example, the following recent papers: Earman [1971], Fine [1972], Glymour 
[1971], Hooker [1972], van Fraassen [1970] and [1972], Winnie [1970]. 
facts, by themselves, would show a difference in meaning here? A similar move
occurs in Buchdahl’s paper. He considers and rejects the view that the main 
support for a theory must come from ‘empirical confirmation’ (p. 227). His 
rejection, however, is not based on an appeal to historical cases but on a philo- 
sophical rejection of the ‘old problem of induction’ (p. 228). So even in the re- 
jection of philosophical theses it is not so much historical cases as contrary 
philosophical theses that are operative.2

The situation is even more problematic if one considers the possible positive 
bearing of historical studies on philosophical conclusions concerning the rational 
choice of theories. As I have indicated several times above, none of the contri- 
butors attempts, or even cites other attempts, to show in a systematic way just 
how philosophical theses about theory choice may be supported by historical 
case studies. To raise this issue is not necessarily to hold dogmatically to a distinction
between the descriptive and the normative. On the contrary, I would
argue that all norms have their roots in facts. The general problem is to show 
that philosophical conclusions may be supported by historical facts and just how 
this comes about. Until this is done, the historical approach to philosophy of 
science is without a conceptually coherent programme. 

There is a further problem which is especially acute for any historical approach 
to criteria for the rational choice of theories. Suppose, as McMullin and others 
suggest, that history provides empirical data for one’s account of theory choice.
In this case the account of theory choice is itself an empirical conclusion, or, 
broadly speaking, a theory. But to choose a theory of theory choice on the basis 
of historical data one must already have some criteria for theory choice. Where 
are we to find these latter criteria? So not all our criteria for theory choice can be 
empirical conclusions supported by historical data. What then are they? And what 
is their relation to historical studies? 

Some will claim that these latter questions merely betray an old-fashioned 
justificationism. Perhaps. But those with more modern views should explain 
clearly why such worries are misguided and how it is that everything comes out 
all right if we ignore them. 


5 AN ALTERNATIVE VIEW 


In so far as my own views are fairly representative of the majority of philosophers 
of science outside the historical school, and in so far as such views are not well 
represented in this Minnesota volume, I will conclude with a brief sketch of an 
alternative analysis of the philosophy of science and of its implications for relations 
between history and philosophy of science. 

The most important divisions within philosophy of science should be by prob- 
lems, not methods. I would focus on three main problem areas: (1) The structure 


1 Kuhn’s appeal to the fact that no ‘proof’ can decide between comprehensive theories can 
be found, for example, in Kuhn [1962], pp. 147, 150-51. For the key remarks on the 
meaning of ‘mass’ in classical and relativistic physics see Kuhn [1962], pp. 100-1.

2 The rejection of the ‘symmetry thesis’ regarding explanation and prediction and the
‘deducibility requirement’ for the reduction of theories are commonly taken to be
examples in which a philosophical thesis was refuted by historical cases, e.g. evolutionary
theory and statistical mechanics respectively. Yet even here the role of history of
science, as such, seems minimal. These cases are important because they are part of the
currently accepted body of scientific knowledge.
of knowledge, especially theoretical knowledge, (2) the validation of knowledge
claims, and (3) the strategy and tactics of research (methodology, in a narrow 
sense). In each case one may be concerned with the problem as it applies to a 
particular theory, field, science, etc., or as it applies to all theories, fields, sciences, 
etc. It is his interest in generality that primarily distinguishes the philosopher of 
science from the theoretical scientist.1 Let us consider the possible bearing of 
history of science on each of these problem areas. 

Among questions concerning the structure of knowledge are questions about 
theoretical explanation, the meaning of theoretical terms, etc. Concerning 
questions about the structure of particular theories, e.g. quantum mechanics, 
philosophers will be guided by the same basic criteria as scientists, except in so 
far as they seek to follow patterns concerning the structure of theories in general. 
Do general theses about the structure of theories have to fit historical cases? I 
think not. If philosophy is to maintain its critical role, one may even refuse to 
accord the title ‘theory’ to something many scientists now call a theory. Indeed, 
all currently held theories may be judged somewhat defective, though not too 
much so else the claim to be talking about scientific theories becomes proble- 
matical. Philosophical theses cannot be completely a priori. 

Now some members of the historical school seem to hold that a theory is an
historical entity, or at least that one could not really understand a theory without 
knowing some temporal developments. Thus the structure of a theory cannot be 
captured by any single logical or mathematical structure. The difficulty with 
this claim, besides its vagueness, is that scientists learn theories from texts con- 
taining little history and what history there is would usually be judged bad 
history. So unless one is willing to claim that most scientists do not really under- 
stand the theories in their fields and could not learn them from standard 
texts, the claim that history necessarily enters consideration of the structure of 
theories must be rejected. There remains, of course, a question as to whether 
and to what extent theories may be set out explicitly in mathematical systems, 
but this problem may be treated separately from questions about the role of 
history of science. 

Questions concerning the strategy and tactics of research may be viewed in 
two quite different ways. They may be treated quite generally and used to help 
define rationality in the scientific context. This is how we should understand 
Peirce’s maxim, ‘Do not block the way of inquiry’, as well as more recent pro- 
nouncements by, say, Popper or Lakatos. That history of science as such plays an 
essential role in this enterprise, and does not serve merely as a convenient source
of examples, is something that needs to be argued. Concern with the process of 
inquiry does not automatically make one an historian. The needed connection 
between process accounts of rationality and real science may be made solely 
within the context of contemporary science. 

The second way of viewing the strategy and tactics of research is as a straight-
forward, though second level, empirical inquiry. Now in science, as in any field,
there is no substitute for genius, and of course no research strategy could 
guarantee success, e.g. in finding a satisfactory unified field theory. But if one 


1 I have sketched a similar taxonomy in Giere [1971].
2 This is the key point in McMullin’s examination of Lakatos [1970] mentioned in foot-
note 3, p. 288.

examines the events leading up to particular discoveries, it at least seems that
some strategies are better than others. And there are good reasons for thinking 
that this is true in general. Given a particular phenomenon of which some 
theories are true, and given the capabilities of human investigators, some actions 
are empirically more likely than others to lead to discovery of true theories. The 
only question is whether we can discover any useful patterns. This, however, is 
an empirical matter which can only be answered by trying. Moreover, the use- 
fulness of history of science in this enterprise is likewise only to be discovered by 
trying. So here is a place where history of science may have a direct input into 
the philosophy of science. The trouble is that not many philosophers of science 
regard the empirical study of research strategies as part of philosophy. Thus the 
point where the relevance of history may be greatest is just the point where the 
philosophical component seems weakest. Perhaps the best conclusion here is 
that both history and philosophy of science are contributing to a third area, say, 
the science of methodology. 

Turning to the problem of validation, I think a majority of philosophers of 
science would, contrary to Kuhn, Feyerabend, and others, agree that there is 
such a thing as empirical validation. That is, there is such a thing as non- 
deductive reasoning, though few agree about its specific form. The question here 
is whether history of science has any role in, so to speak, validating an account of 
empirical validation, whatever its specific character. I have already argued that 
history of science cannot be regarded as providing empirical evidence for a philo- 
sophical account of empirical validation. This view presupposes an account of 
validation. But our theory of validation cannot be purely a priori. What is left? 
The only way out, I think, is to admit that the choice of a theory of validation is 
ultimately not wholly a reasoned choice, but partly just the result of a causal
process. Which causal process? In principle any process involving anyone who 
adopts a method of validation. In practice, however, the most relevant process 
is that involving philosophers, mathematicians and scientists who collectively 
formulate and apply various validational methods. For a method of validation 
to be accepted is for it to be thought correct and adopted in practice by an effec- 
tive majority of philosophers and scientists. A philosopher can only influence 
this process by criticizing, clarifying and developing methods of validation and 
then persuading scientists to try his recommendations. There is no guarantee, of 
course, that any single method will ever be effectively accepted. One can only try. 

The history of science can only indirectly enter into the process by which an 
account of validation becomes accepted. Philosophers and scientists may be 
influenced by their understanding of historical cases. But history of science need 
not enter the process, and it would be difficult to argue that it should. What we 
seek is a unified method of validation to be applied in current scientific inquiry. 
To argue that our understanding of past science, which is itself based on empirical 
evidence, should be fed into the process of choosing a theory of validation is to
assume that we are right about the past and that this past experience is relevant 
to present scientific inquiry. Yet even historians would agree that relevance to 
present science decreases the further back we go. Moreover the period in which
relevance to current scientific inquiry becomes problematic is just the period


2 This approach to the choice of a method of validation, as well as an account of validation
itself, is discussed in Giere [1973].

which has received the most attention from historians, namely the seventeenth
century. Thus in so far as there is a role for history of science in resolving the 
problem of validation (including theory choice), it is only the history of recent 
science (say the nineteenth and twentieth centuries) that is at all important. 

It may be thought that the above account of validation is little different from 
Kuhn’s view that theory choice is a non-rational process. There is, however, an 
important difference. Here theory choice is rational—it is a matter of empirical
validation. All the blood, sweat, toil and tears that go into the process of con- 
ceiving a new theory are, it is true, largely irrelevant to validation, but this 
process is hardly irrational and certainly subject to systematic empirical study. 
Moreover, it is assumed that the problems concerning incommensurability and 
the ‘theory ladenness’ of the observation language are resolvable. With these 
provisos, the intervention of non-rational processes is confined to the next level 
up—the choice of a method of validation. This is in accord with the idea that the 
real revolution in the seventeenth century was one of method rather than content. 

If history of science is somewhat relevant but not essential for the philosophy 
of science, what of the reverse? Here I agree with McMullin that the primary
goal of the historian is to explain particular occurrences in
science, e.g. why Newton did what he did, when and where he did it. To do this 
well requires no understanding of current philosophy of science. What it does 
require is an understanding of seventeenth century philosophical ideas about 
science. It also requires a great deal of other knowledge about the seventeenth 
century, from social conditions to theology. Everything is potentially relevant. 
The historian’s task is to weigh their relative importance in producing Newton 
and his physics. The dispute between internal and external history arises only if 
one approaches the question with the view that the character of Newton’s 
physics is determined solely by his ideas about space, time, matter, force, etc. 
Conversely, one might start out convinced that social conditions alone determine 
the outcome. I doubt that many historians of science fit either caricature, though 
many do exhibit some bias when it comes to weighing the relative importance of 
various kinds of factors. 

There is another way of approaching the history of science, though this is not 
the way it is usually done. This is to begin with current philosophical conceptions 
of the nature of theories, validation, etc., as well as current views of the scientific 
material. Then, in the case of Newton, for example, one could ask why he
strongly believed something even though his evidence for it was very poor.
Here an account of validation creates a clear distinction between ‘internal’ and 
‘external’ factors influencing belief. Similarly, one could fairly clearly separate 
retroductive arguments from inductive (to use Peirce’s distinction). Or again, 
given that such-and-such a physical condition is necessary for a certain con- 
clusion, on what grounds could Newton assume it even though he never acknow- 
ledges the assumption? The answers to such questions should be historically 
interesting—they would tell us something about Newton we could not learn 
simply by immersing ourselves in Newton’s own world. If anyone is to do this 
sort of history, however, it will probably have to be a philosopher. To an his-
torian of science this approach will seem too much like that of the proverbial
retired scientist who goes through history ticking off the things that someone got
right. The analogy is unfair, of course. The point of such studies would not be
to judge Newton, but to understand him.

6 CONCLUSION


This latest volume of the Minnesota Studies grew out of an attempt to investigate 
the rationale for the union of history of science with philosophy of science. At 
best the results of that investigation must be judged inconclusive. Advocates of 
the union have not produced good reasons for thinking that the union is particu- 
larly intimate. Indeed, viewed from a more orthodox position within the philo- 
sophy of science, there is every reason to think that the primary relationships for 
philosophy of science are with philosophy and science. Likewise, the primary 
relationships for history of science are with history and science. What they have 
in common is science. But this common interest is not a sufficient basis for other 
than a marriage of convenience. However, even if one agrees that the union 
between history and philosophy of science lacks a strong conceptual rationale, it 
does not follow that the marriage of convenience should be dissolved. The pro- 
liferation of centres, departments and programmes for history and/or philosophy 
of science during the past decade shows that neither historians nor philosophers 
of science are happy with their parent disciplines. In these circumstances a 
marriage of convenience may currently be the most practical institutional 
arrangement. Whether this arrangement will prove to be relatively permanent or 
only transitional remains to be seen. 


RONALD N. GIERE 


Department of History and Philosophy of Science,
Indiana University

Reviews


Roger C. Buck and Robert S. Cohen (eds.) [1971]: PSA 1970, Boston Studies in
the Philosophy of Science, 8. Dordrecht: D. Reidel Publishing Company.
D.fl. 120 (US $38.40). Pp. lxvi+615.


The eighth volume of the series of Boston Studies in the Philosophy of Science com-
prises the proceedings of the 1970 biennial meeting of the Philosophy of Science
Association. As a result of Rudolf Carnap’s death shortly before the meeting, the first
session was devoted to recollections of him. These, together with other
tributes, are printed at the beginning of the volume. The remainder is made up 
of three symposia and twenty nine contributed papers. The symposium on 
‘History of Science and its Rational Reconstructions’, led by Imre Lakatos and 
Thomas S. Kuhn, was already reviewed in a previous issue of this Journal (see 
Smart [1972]). 

The first symposium consists of James G. Greeno’s ‘Theoretical Entities in
Statistical Explanation’, with comments by Wesley C. Salmon and Richard C. 
Jeffrey. Greeno’s paper extends an earlier information-theoretic analysis of 
statistical explanation to cases where certain of the variables of a statistical theory 
are of theoretical character. By a statistical theory is meant a probability space 
with two or more distinguished partitions of the underlying domain (considered 
for simplicity to be finite). One of these, M, is understood to correspond to those 
observable variables whose values are to be explained, e.g. the presence or 
absence of fever and blotchy skin rash, whilst the other, denoted by S, corre- 
sponds to the joint values of the variables in terms of which the explanation is to 
be given, e.g. age, medical history, recent contact with people showing fever 
symptoms, etc. The theory specifies a probabilistic correlation between values of 
the variables associated with S and M. Two statistical theories based on the same 
probability space and the same explanandum partition M may nevertheless differ 
in their degrees of explanatory power if they employ different explanans parti- 
tions S and S’, e.g. S’ may be a refinement of S. Greeno proposes an explication 
of ‘degree of explanatory power’ as the information I(S, M) transmitted by the
theory:

(*) I(S, M) = H(S) + H(M) − H(S × M),

where H(S), H(M), H(S × M) are the entropies of the partitions S, M and the
joint partition, respectively, relative to the probability space in question. In this 
paper Greeno is especially concerned with those statistical theories employing a 
further partition T of the underlying domain by means of theoretical variables 
for which P(m|st) = P(m|t) for all m ∈ M, s ∈ S, t ∈ T. Given the value of such a
variable, e.g. virus infection, the probability of occurrence of given symptoms is 
independent in the above sense of age, medical history, etc. Only in this sense 
need the variables associated with the partition T be ‘theoretical’ relative to the 
theory in question. It follows for such theories that 


I(S, M) ≤ I(T, M)

and thus the (T, M)-subtheory whose explanans partition corresponds to the
theoretical variables has at least as high a degree of explanatory power, in the
proposed sense, as the phenomenological (S, M)-subtheory. According to
Greeno, ‘this fact is related to the intuitions that we have regarding the desir- 
ability of theoretical explanations when the dependencies between empirical 
variables are statistical’ (p. 7). It is admitted, however, that the introduction of 
theoretical variables need not change the empirical content of the theory. But it 
may serve the heuristic purpose of guiding the search for a new partition S″, e.g.
corresponding to observable variables more strongly correlated with theoretical
variables than those associated with S, such that I(S″, M) approaches I(T, M)
more closely than does I(S, M). On the other hand, in connection with theories
describing systems whose statistical properties are time-dependent, Greeno 
states that ‘the use of theoretical entities can lead to a considerable simplification 
of the statistical structure of a theory in addition to their heuristic value’ (p. 15). 
(This simplification, however, is just the result of postulating the Markov 
property for the sequence of theoretical states of the system. Even in the non- 
stationary cases, therefore, it seems primarily to be the heuristic function of 
theoretical variables rather than their simplifying function that might be 
accounted for by the information-theoretic approach.) An interesting application 
of the approach is given to a theory of memorising, using information transmitted, 
summed over trials, as a measure of explanatory power. There is little indica- 
tion, however, why this should be a suitable measure. 
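
As a rough numerical illustration of the inequality discussed above, the following Python sketch builds a small joint distribution in which a theoretical variable T screens off S from M, and then evaluates the measure (*) for the (S, M) and (T, M) pairs. The probabilities, the variable values and the function names are assumptions made only for this illustration; none of them is taken from Greeno's paper.

```python
import math
from itertools import product


def entropy(probs):
    """Shannon entropy (in bits) of an iterable of probabilities."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def marginal(joint, axis):
    """Marginal of one coordinate of a joint distribution {(x, y): p}."""
    out = {}
    for key, p in joint.items():
        out[key[axis]] = out.get(key[axis], 0.0) + p
    return out


def transmitted(joint):
    """I(X, Y) = H(X) + H(Y) - H(X x Y) for a joint distribution {(x, y): p}."""
    hx = entropy(marginal(joint, 0).values())
    hy = entropy(marginal(joint, 1).values())
    return hx + hy - entropy(joint.values())


# Hypothetical numbers, chosen only for illustration.
# S: age group, T: infection status, M: symptoms.
p_s = {"young": 0.5, "old": 0.5}
p_t_given_s = {"young": {"virus": 0.2, "none": 0.8},
               "old": {"virus": 0.6, "none": 0.4}}
p_m_given_t = {"virus": {"fever": 0.9, "well": 0.1},
               "none": {"fever": 0.1, "well": 0.9}}

# Joint P(s, t, m) = P(s) P(t|s) P(m|t). The last factor ignores s,
# which is the screening-off condition P(m|st) = P(m|t).
joint_stm = {(s, t, m): p_s[s] * p_t_given_s[s][t] * p_m_given_t[t][m]
             for s, t, m in product(p_s, p_m_given_t, ["fever", "well"])}


def pair_joint(i, j):
    """Joint distribution of coordinates i and j of (s, t, m)."""
    out = {}
    for key, p in joint_stm.items():
        pair = (key[i], key[j])
        out[pair] = out.get(pair, 0.0) + p
    return out


i_sm = transmitted(pair_joint(0, 2))  # explanans S, explanandum M
i_tm = transmitted(pair_joint(1, 2))  # explanans T, explanandum M
print(f"I(S, M) = {i_sm:.4f} bits")
print(f"I(T, M) = {i_tm:.4f} bits")
```

With any distribution built in this way the second figure should be at least as large as the first. The sketch says nothing, of course, about whether information transmitted is a suitable measure of explanatory power.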

In his admirable comments Jeffrey notes that the measure (*) has the merit of 
assuming its minimum value when S and M are independent, i.e. P(m|s) =
P(m) for all m ∈ M, s ∈ S, and of increasing with strength of correlation. He points
out, however, that since the quantity (*) is symmetric in S and M, the proposed
explication appears to disregard the distinction between explanans and explanan-
dum. Furthermore the measure (*) of explanatory power of the statistical theory
relative to S and M is independent of its theoretical partition T: perhaps it 
should also depend on the entropy of T and of the remaining joint partitions. 
Jeffrey concludes by expressing doubts as to the possibility of evaluating an 
explication of ‘degree of explanatory power’ until we have some idea of what 
role it could play in a more general account of scientific methodology. There 
are also some interesting, largely explanatory, comments on Greeno’s paper by 
Salmon. 
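
Both of Jeffrey's observations about the measure (*) can be read off a standard rewriting of it. The following is a sketch only; the notation P(s, m) for the probability of the joint cell is an assumption of this illustration and is not taken from the papers under review.

```latex
\begin{align*}
I(S, M) &= H(S) + H(M) - H(S \times M) \\
        &= \sum_{s \in S} \sum_{m \in M} P(s, m)\,
           \log \frac{P(s, m)}{P(s)\,P(m)} \;\ge\; 0 .
\end{align*}
```

The sum vanishes exactly when P(s, m) = P(s)P(m) for all s and m, which is the independence condition, and the expression is unchanged when the roles of S and M are exchanged, which is the symmetry to which Jeffrey objects.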

The symposium on ‘Capacities and Natures’ is led by a paper by Milton Fisk. 
It is a very difficult task to form a clear conception of what is stated in this paper, 
partly on account of a style designed, it would appear, to avoid, as far as possible 
and at any cost, the use of technical terms, polysyllables and punctuation. Let 
the reader parse the following (p. 59): ‘To say the proposition corresponds to the
nature of a is only to say that what the proposition says a is is what it is the nature
of a to be’. Yet still he may remain ignorant what what it is the nature of some-
thing to be is. Despite the sophistication of much of the argument, such critical
terms as ‘explanation’, ‘modal conditional’ and ‘necessary connection’ are used 
as though they had a clear and unique sense already familiar to the reader. The 
paper attempts a fine distinction between entities, conditions, parts, components, 
capacities and natures. However, it does not make clear to the present reader, nor
perhaps to others whose intellectual sympathies lie more with the concepts and
methods of modern science and mathematics than with those of the traditional
[illegible], what important problem in the philosophy of science would be
solved, were the attempt successful. Ernan McMullin, in a sympathetic and lucid
reply, traces the detailed relations between the views of Fisk and those of Aristotle 
and Locke. He concludes that modern physical science has found it unnecessary 
to postulate the natures advocated by Fisk. 

The remainder of the volume consists of twenty nine contributed papers, some 
better, some worse. I shall mention a selection of them in the order in which they 
occur in the volume. 

J. O. Wisdom’s ‘Observations as the Building Blocks of Science in 20th-Century
Scientific Thought’ is intended as a criticism of the doctrines of instrumentalism 
and conventionalism (under which latter operationalism is ‘easily subsumable’ 
according to the author). Unfortunately these doctrines at no time receive a more 
exact characterisation than the following: ‘According to instrumentalism... 
scientific theories are neither true nor false; they are only more efficient or less 
efficient instruments ... Conventionalism on the other hand allows of truth- 
value . . . such that the conventions assigning the meanings of the terms of a 
theory are more or less appropriate’ (pp. 212-13). Similarly, observations are 
said to be ‘theory-impregnated’ with no explanation how this metaphor should 
be understood. As a result, no more is achieved than one would expect from a 
discussion at this superficial level of analysis. 

Clark Glymour’s contribution ‘Theoretical Realism and Theoretical Equivalence’ 
is of an altogether different stature. Glymour’s aim in this is to argue against the 
thesis that there cannot exist two distinct theories between which no possible 
evidence could discriminate or, more exactly, the thesis that empirically equiva- 
lent theories are synonymous and therefore fully intertranslatable. By ‘empirically 
equivalent theories’ are meant theories with common classes of observational 
consequences. Glymour believes this thesis might be defended on the basis of the 
viewpoint (A), which he criticises, that a theory is true if and only if all its 
observational consequences are true. For then the ‘truth conditions’ for two 
empirically equivalent theories are the same and hence they are in one sense 
synonymous. Glymour objects to viewpoint (A) on the grounds that, for at least 
one interpretation of the novel semantics which it suggests, classical logic will be 
inadequate, e.g. any formula having only tautologies as observational conse- 
quences will be valid but not necessarily provable. He shows, furthermore, that the 
semantical consequence relation based on this interpretation of viewpoint (A) is 
not recursively enumerable and therefore no proof theory employing an effective 
notion of proof will be complete with respect to it. In addition to these difficulties, 
Glymour objects that Newton’s Laws, which by themselves have no non- 
tautological observational consequences, will be true according to (A) whereas 
we believe them to be false. (It might be questioned, however, on what grounds 
and in what sense we believe them to be false on this understanding of them.) 

The viewpoint (A) is attributed to Reichenbach on the basis of his statements 
concerning ‘equivalent descriptions’. It is doubtful, however, whether he would 
have subscribed to the semantics taken by Glymour to explicate this viewpoint 
or whether he would have taken his thesis that there can exist equally adequate 
descriptions of given phenomena based on different conventions to mean that the 
two sets of conventions are ‘synonymous’ or literally intertranslatable. His major
point in this appears to have been that the choice between them is not an empir-
ical question. It seems that he would have fully endorsed Glymour’s final
conclusion that ‘even all possible evidence can sometimes have more than one
correct explanation’ (p. 285).

Glymour next considers the case of two first-order theories understood to 
share a common observational vocabulary but to have no theoretical expressions 
in common, thus leaving it open whether some theoretical expressions of the two 
theories are nevertheless synonymous. He proposes first as a necessary condition 
for their theoretical equivalence, or full synonymity, that they should be inter- 
translatable in the sense of having a common definitional extension. This means 
that there should exist a set of explicit definitions D_Q of the expressions of theory
T in terms of those of theory Q, and a corresponding set of definitions D_T of
expressions of Q in terms of those of T, such that the sets T ∪ D_T and Q ∪ D_Q are
logically equivalent. (We remark that this is a strong condition of literal, i.e.
term-by-term, intertranslatability.) Glymour, however, does not work directly 
with this condition but introduces a second, apparently claimed to be weaker 
than the first, namely that the two theories ‘have a model in common’. By this is 
meant that the theories have models whose diagrams have a common definitional 
extension in the sense given. (It seems not to go without proof, however, that this
is really a weaker condition unless both theories are consistent and complete,
when the two conditions are equivalent.) Two theories are then exhibited which
have no model in common and are therefore not theoretically equivalent in the 
proposed sense, but which are nevertheless empirically equivalent—the sole pre- 
dicate of the observation language being the identity predicate. Glymour also 
notes, without endorsing it, that both theories satisfy the desideratum of Kneale 
that a theory must be finitely axiomatisable, but its collection of observational 
consequences must not be finitely axiomatisable. Nevertheless the theories 
exhibited seem to have a purely mathematical character. It is pointed out, there- 
fore, that a theorem due to R. Robinson shows that elementary elliptic and 
Euclidean (or hyperbolic) geometry have no model in common. Yet according to 
convincing arguments of Reichenbach, any pair of geometries can be inter- 
preted so as to be empirically equivalent. (The notion of empirical equivalence, 
however, is now being given a sense other than the formal one it had up to this 
point.) Although the discussion is not entirely conclusive or complete, it can 
serve as an example to all who wish to treat these problems in a scientific spirit. 

John A. Winnie’s paper ‘Theoretical Analyticity’ is concerned with the problem 
of splitting up an empirical theory into a factual and conventional (analytic) 
component. Carnap already proposed three adequacy conditions for a solution 
but these are not generally sufficient to determine it uniquely. The solution fav- 
oured by Carnap was the weakest satisfying the conditions. The aim of this paper 
is to propose a further condition of adequacy such that only Carnap’s solution 
satisfies the extended set. According to the author, Quine’s objections to the 
analytic-synthetic distinction on the grounds of arbitrariness can only be met if 
this can be achieved. (This is open to question, however, on the ground that to 
show a solution to be non-arbitrary it is not necessary to show it to be unique but 
only that there are some reasons, relative to certain aims, problems, etc., for pre- 
ferring it to others. Relative to different aims, problems, etc., another solution 
might be preferable. According to this conception, the choice of solution may
be conventional but need not be arbitrary.)

The new condition of adequacy proposed employs the very interesting notion
of ‘observational vacuity’. A sentence S is said to be observationally vacuous in
theory T in case (i) S is a theorem of T and (ii) every observation statement
logically implied by S conjoined with any other theorem S’ of T is already 
logically implied by S’ alone. ‘Observationally vacuous sentences are those which 
function inessentially in any deduction of an observational consequence from the 
theory’ (p. 296). The new condition of adequacy states that the conventional 
component, and hence each analytic sentence, should be observationally vacuous 
in the theory in question. Otherwise ‘we are allowing sentences which are not 
observationally vacuous to be considered as analytic. But such sentences will 
function essentially in the deduction of some contingent observation statement 
from the theory, and could thus be reasonably said to be confirmed or dis- 
confirmed by the appropriate observational outcome’ (pp. 297-8). The major 
result of the paper is that a sentence is observationally vacuous if and only if it is 
analytic in Carnap’s sense. It follows that the solution to the extended set of 
conditions is uniquely Carnap’s. 

It is too early yet to reach a definite assessment of the cogency or efficacy of 
Winnie’s new condition. (The argument given in the paper on its behalf, which 
has already been quoted, is too brief to be conclusive.) The following remarks, 
however, may be to the point. First, since the paper deals with a first-order 
theory T, when the reader comes across the definition of observational vacuity 
together with the remarks quoted above, he may take the observation statements 
referred to to be likewise of first-order. However, this would be a mistake. The 
notion of logical implication, or consequence, involved must relate to the second-
order extension of the language of T. Winnie in fact uses the semantical conse-
quence relation based on the standard models (in the sense of Henkin) of that
language though, for the purpose of achieving his major result, it would suffice to 
employ the weaker relation for which there is a complete and effective notion of 
proof. But we cannot do without some such non-elementary notion. If all state-
ments involved in the definition are restricted to first-order, it can be shown that
there exist theories with elementarily observationally vacuous consequences, let 
us say, that are not analytic in the sense of Carnap. (The observation statement 
used in the proof of the major result is, in fact, the Ramsey sentence of the theory. 
It is unusual, however, to regard this as a test statement to be used in confirming 
or disconfirming the theory.) Secondly, the notion of observational vacuity is 
equally well-defined for infinite postulate sets, unlike that of analyticity in the 
sense of Carnap. There is, however, a natural generalisation of Carnap’s solution, 
applicable in all cases, which takes as analytic just those theorems of T which are 
true in every structure whose O-reduct is not the O-reduct of any model of T 
(this corresponds, in the finite case, to the requirement of being a logical con- 
sequence of the negation of the Ramsey sentence). It would be interesting to 
know whether Winnie’s major result also holds in the infinite case or whether the 
requirement that all analytic statements should be observationally vacuous is 
then not strong enough to ensure a unique solution to the problem in question. 
Lastly we remark that the discussion of the paper is restricted to the languages 
without identity. The major result, however, seems to hold equally for the lan- 
guages with identity, though some of the other theorems proved require certain 
adjustments. This paper should be a useful and stimulating source of ideas for 
anyone working in this circle of problems.


The section on ‘Probability, Statistics and Acceptance’ includes papers by Jaakko
Hintikka, Ben Rogers, William W. Rozeboom and Alec C. Michalos. Hintikka’s
paper contains a useful discussion of de Finetti’s representation theorem and its
philosophical significance. One of the conclusions reached is that ‘presupposing 
the exchangeability assumption, [de Finetti’s line of thought] implies that a bet 
on a general law is simply to make an infinite system of bets of a certain specifiable 
kind on singular statements’ (p. 339). Rogers’ paper, on the other hand, repeats 
some familiar material on the role of alternative hypotheses in the Neyman-
Pearson theory of testing. Rozeboom’s ‘New Dimensions of Confirmation Theory II’
is a technical study. One of its concerns is the intuition that whether all As are Bs 
is generally irrelevant to whether some particular thing is an A. A theorem 
proved shows that, however one construes the implication involved in ‘All As 
are Bs’, if relevance is measured as a confirmation ratio and confirmation satisfies 
the probability axioms, this intuition can only be satisfied at the expense of 
others equally strong. The paper raises many problems and should repay careful 
study by those interested in confirmation theory. 

In his paper ‘Cost-Benefit versus Expected Utility Acceptance Rules’ Michalos 
proposes a rule for the acceptance of scientific hypotheses which he calls ‘the 
principle of cost-benefit dominance’. Rules prescribing the maximisation of 
expected utility (of the sort proposed by I. Levi, for instance) come in for heavy 
criticism on the ground that, for their application, they require information that is 
very hard to come by, namely probabilities and epistemic utilities. There is also 
the problem of weighting the various desiderata. Michalos proposes instead the 
acceptance of the hypothesis whose benefits and costs of accepting are preferable 
to those of any alternative, for every possible contingency and for every desirable 
attribute. But in cases where this rule (in its pure form, at least) is applicable, i.e.
cases of dominance, it is not clear that there is any conflict with the expected 
utility rules. In other cases Michalos mentions possible modifications in his rule 
of a sort corresponding to non-Bayesian solutions to the problem of decision in 
uncertainty, but they are not worked out in detail. He discusses Harvey’s defence 
of the circulation of the blood from the standpoint of the cost-benefit rule, but 
this illustration can hardly serve such a conclusive purpose as Michalos thinks. 
More modest claims in favour of his approach would have carried more con- 
viction. 
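
The remark about cases of dominance can be checked on a small example. The sketch below is illustrative only: the payoff figures, the labels H1 and H2 and the helper functions are assumptions introduced here, not anything taken from Michalos’s paper, and the net benefit of accepting a hypothesis is collapsed to a single number per contingency (his rule also ranges over several desirable attributes). Under those assumptions it shows that a hypothesis which is at least as good as its rival in every contingency also has at least as great an expected value for every probability assignment on a coarse grid, so that the two kinds of rule do not conflict there.

```python
from fractions import Fraction
from itertools import product


def dominates(a, b):
    """True if a is at least as good as b in every contingency
    and strictly better in at least one."""
    pairs = list(zip(a, b))
    return all(x >= y for x, y in pairs) and any(x > y for x, y in pairs)


def expected_value(payoffs, probs):
    """Probability-weighted sum of the payoffs."""
    return sum(p * v for p, v in zip(probs, payoffs))


def probability_grid(n, steps=10):
    """Yield every probability vector of length n whose entries are
    multiples of 1/steps (exact fractions, so no rounding error)."""
    for combo in product(range(steps + 1), repeat=n):
        if sum(combo) == steps:
            yield [Fraction(c, steps) for c in combo]


# Hypothetical net benefits of accepting each hypothesis in three contingencies.
H1 = [5, 3, 4]
H2 = [4, 3, 1]

# Does the expected-value ranking ever reverse the dominance ranking?
agree = all(
    expected_value(H1, p) >= expected_value(H2, p)
    for p in probability_grid(len(H1))
)
```

The grid search is only a finite check; the general fact follows because each term of the weighted sum for the dominating hypothesis is at least as large as the corresponding term for its rival.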

Paul Fitzgerald’s ‘Tachyons, Backwards Causation, and Freedom’ begins with a 
useful survey of the present state of tachyon theory. There is an interesting 
critical discussion of the effectiveness of the reinterpretation principle (negative- 
energy tachyons propagating backward in time interpreted as positive-energy 
tachyons propagating forward in time) in eliminating cases of so-called retro- 
causality. It is not always easy, however, for the reader to assess the force of the 
thought-experiments discussed, in the absence of a more detailed treatment of 
tachyon interaction with subluminal particles. Turning to the idea that tachyons 
may even involve a self-contradiction, Fitzgerald considers a device he calls a 
Logically Pernicious Self-Inhibitor. It consists of (i) a tachyon transmitter and
receiver A which only transmits at a given time if it has not received a tachyon
within the preceding five minutes of proper time, and (ii) a tachyon reflector B
which reflects all tachyons emitted from A to arrive back at A two minutes (in proper 
time) prior to their emission. Clearly no tachyon could be controlled in all the ways
required for the operation of this device. Fitzgerald concludes that ‘it is
empirically unlikely that tachyons exist’ (p. 427) on the ground that, if they did, they
would surely not be so uncontrollable. But this appears to have the same form as 
the following argument. Consider a town whose only resident barber shaves all and
only those residents who fail to shave themselves. Clearly no barber could be 
controlled in all the ways required for the operation of this system. Hence it is 
empirically unlikely that barbers exist. The last section of the paper considers the 
implications of veridical tachyon messages from the future for questions of human 
freedom. The problem, however, has nothing peculiarly to do with tachyons. It is 
much the same as Aristotle’s problem of the sea-battle, except for its formulation 
in the epistemic mode. Fitzgerald takes perhaps longer than necessary to reach 
the conclusion that the fact that I don’t kill Ramses tomorrow, and know (now) 
that I don’t, does not mean that I cannot kill him tomorrow:—in the same way, 
one might add, as the fact that I didn’t kill him yesterday, and know (now) that I 
didn’t, does not mean that I could not have killed Ramses yesterday. 

The purpose of Warren D. Siemens’s ‘A Logical Empiricist Theory of Scientific
Change?’ is to investigate what R. Suszko has called the ‘diachronic’ logic of 
science. With an acknowledged debt to several previous authors, e.g. to Suszko, 
Siemens characterises the state of a science at a given time by means of certain 
logical parameters and distinguishes, using logical concepts, between different 
sorts of transition in that system of states. The theory, which so far consists of 
thirteen definitions, is only intended to be classificatory (e.g. which transitions 
correspond to ‘progress’?): there are no transition probabilities. Some of the 
requirements Siemens imposes on further developments of his sketch seem 
unnecessarily strong. For instance, logical comparability of succeeding theories 
is demanded neither by his formalism nor by all writers belonging even to the 
early period of Logical Empiricism. (Already in the early 1930s Ajdukiewicz
claimed that classical and special relativistic mechanics were mutually untrans- 
latable.1) The paper recalls, and to a certain extent elaborates, some interesting 
ideas, but it opens with several gratuitous and inaccurate historical observations. 
It is claimed that ‘according to... the orthodox Logical Empiricist view, the 
problem of developing a theory of scientific change was not considered to be a 
legitimate problem for the philosophy of science’ (p. 524). In defence of this claim 
only some equally unfounded remarks of D. Shapere are cited. Writers in the 
tradition of Logical Empiricism have always distinguished between historical 
problems in the analysis of science and logical problems. Although it was mainly 
the study of the latter that was practiced by members of the Vienna Circle and 
related groups in Berlin, Warsaw, the U.S.A. etc., the study of both was en- 
couraged. The historical development of science, in particular the discovery of 
the theory of relativity, stimulated both Reichenbach and Carnap. According to 
its author, Bridgman’s philosophy arose directly out of his concern with the rela- 
tion between classical and relativistic electromagnetic theory. An explicit treat- 
ment of logical relations between theories occurred in the analysis of crucial 
experiments (however oversimplified it may have been). The logical relation of 
reducibility was discussed in connection with the thesis of physicalism, this being 
an integral part of the original movement and not merely a reaction to criticism. 
Inter-theory relations, especially those between classical and relativistic con- 
ceptions of mass, were extensively discussed in Frank [1938]. Ajdukiewicz, 
especially, treated at length the problem of comparability of ‘scientific world-
perspectives’ (see Ajdukiewicz [1935], and also the writings referred to there).


1 For a discussion, cf. Giedymin [1973].
According to Ajdukiewicz, ‘the epistemologist should give his attention to the
changes which occur in the conceptual apparatus of science and in the correspond- 
ing world-perspectives, and should seek to ascertain the motives which bring 
these changes about’. A study of logical relations between theories, using meta- 
mathematical concepts which only later became explicit, is not to be found in the 
early writings (though some of the discussions of Ajdukiewicz came close to it). 
But it is absurd to suggest that the sort of enterprise Siemens is promoting would 
be thought ‘illegitimate’ in Logical Empiricism. It is hard to see what serious 
intellectual purpose the creation and perpetuation of these myths about that 
tradition, and especially about its early days, can serve. 

In his paper ‘From Logical Systems to Conceptual Populations’ Stephen 
Toulmin contrasts problems of two types in the philosophy of science: problems 
concerning the ‘logicality’ of science and problems concerning its ‘rationality’. 
‘Logicality’ is concerned with ‘keeping our inferences neat and tidy, avoiding 
formal blunders, doing our sums aright’ (pp. 553-4). ‘Rationality’, on the other 
hand, has to do with ‘the ways in which new concepts are developed, in order to 
solve outstanding problems’ (p. 554). The former problems can be adequately 
dealt with by ‘the logician’s program for the philosophy of science’ (this apparently 
being the programme for axiomatizing scientific theories). However, it leaves many 
acute and important problems concerning the rationality of science unsolved. 
Toulmin conceives of a natural science not as a coherent logical system but as a 
‘conceptual aggregate or “population”’. For an adequate analysis of the ration-
ality of science we must study (i) the non-formal relations between the concepts
in the population and (ii) the ways in which conceptual problems arise and are
recognized as such. Briefly, we must study ‘conceptual populations and... the 
temporal sequences of conceptual variation and selective perpetuation by which 
these populations develop’ (p. 564): hardly a novel suggestion. (See, for instance, 
Ajdukiewicz’s just quoted remark, anno 1935.) What is original in Toulmin’s 
approach, is the claim that the methods of logic are emphatically not appropriate 
for this study. However, since the author appears to be familiar only with the 
elements of logical syntax (proof theory), this statement, though it may be true as 
far as his understanding of the problems in question goes, carries little weight.
As to what methods are appropriate, the paper gives no hint.1

As a whole, the volume is an expensive mixture of good and bad. Those who 
want to buy the good half have also to pay for the bad. Another noticeable feature 
is the grouping together in some sections of quite unrelated papers, for instance 
of Edward Erwin’s surely frivolous paper ‘The Confirmation Machine’ with the
logical studies of Glymour and Winnie. Another example is the association of 
Capek’s paper on Piaget with works on problems of modern physics by Fitzgerald
and Nagasaka. (Surely Giannoni’s paper on Einstein and Lorentz would
have been a better companion.) Hopefully these shortcomings will be over- 
come in future by publishing just the better half or by consigning the remain- 
der to a separate volume. 


P. M. WILLIAMS 
University of Sussex 


1 This paper may well be a précis of Toulmin [1972], for a discussion of which
cf. L. J. Cohen [1973].
Philosophy of science—strictly speaking of physics—in this historical intro-
duction is taken to be ‘second order criteriology’, i.e. concerned with seeking 
answers to questions such as ‘(1) What characteristics distinguish scientific
theories from other types of investigation? (2) What procedures should scientists
follow in investigating nature? (3) What conditions must be satisfied for a
scientific explanation to be correct? (4) What is the cognitive status of scientific
laws and principles?’ (p. 2). A brief account is then attempted of what some
philosophers (e.g. Aristotle, Bacon, Whewell, Mill) have written about scientific 
method and what some philosophically influential physicists (e.g. Archimedes, 
Galileo, Newton, Poincaré) have done in their science that affected philosophy 
or what they have written about scientific method. There are four chapters deal- 
ing with ancient science and philosophy of science (Aristotle’s philosophy of 
science, Pythagoreanism, the Euclidean ideal of deductive systematisation,
atomism). Chapter 5 discusses the development of Aristotle’s philosophy in the 
Middle Ages (Roger Bacon, Duns Scotus, Ockham, Grosseteste, Nicolaus of
Autrecourt); chapter 6 presents ‘the debate over saving the appearances’
(Osiander, Copernicus, Bellarmino, Kepler); chapter 7 deals with seventeenth- 
century attacks on Aristotelian philosophy (Galileo, Bacon, Descartes); chapter 8 
is on Newton’s axiomatic method. In chapter 9, entitled ‘Analyses of the Implica- 
tions of the New Science for a Theory of Scientific Method’, the reader will 
find a variety of topics: in the first part, Locke’s, Leibniz’s, Hume’s and Kant’s 
views bearing on the ‘cognitive status of scientific laws’; in the second part,
Herschel’s theory of scientific method, Whewell’s ‘conclusions about the history 
of the sciences’ and Meyerson’s ‘on the search for conservation laws’; in the third 
part, entitled ‘The Structure of Scientific Theories’—apart from a few lines on
the relevance of the discovery of non-Euclidean geometries for the distinction 
between pure and physical geometries—we find ‘Duhem on the Binding of 
Laws’, ‘Campbell on Hypotheses and Dictionaries’, ‘Hempel’s Criticism of
Campbell’s Position on Analogies’, ‘Hesse on the Scientific use of Analogies’ and
‘Harré on the Importance of Underlying Mechanism’. The contrast between 
inductivism (represented by Mill) and the hypothetico-deductive view of science
(Hempel-Oppenheim’s ‘deductive schema’, Nagel’s criteria of nomic status, 
Frank on simplicity) is presented in chapter 10. Chapter 11 is entitled ‘Mathe-
matical Positivism and Conventionalism’ (Berkeley, Mach, Duhem, Poincaré,
Hanson). Finally chapter 12 is devoted to ‘Twentieth Century Views on Demar- 
cation of Science’ (Bridgman, Carnap and some of his Viennese colleagues, 
Ayer and Popper). 

A historical introduction to the philosophy of science is badly needed. There 
are, of course, some anthologies of the classics of the philosophy of science, a few 
essays on individual philosophers and chapters in some books on the history of 
philosophy (e.g. Passmore’s popular A Hundred Years of Philosophy, available as 
a Pelican paperback). A systematic account of the development of fundamental 
concepts and doctrines in the philosophy of science has not been produced yet. 
John Losee’s book was, presumably, intended to fill this gap. I think that 
it has not quite succeeded in doing this. Not because it is elementary and 
sketchy. It was meant to be elementary and should be evaluated as such. As an 
introductory book it does have certain advantages: its language is clear, simple 
and mostly non-technical; most examples are easy to understand; it is short, 
especially considering the enormous time-span between the views reported in
chapters 1 and 12; it does give a lot of information; it draws attention to some 
anticipations of modern ideas; it is, on the whole, correct and fair in reporting 
briefly the views of individual philosophers. Yet, to my mind, the book gives a 
rather distorted account of modern philosophy of science, partly due to a biased 
evaluation of the relative importance of the contributions made by various 
modern philosophers which resulted in an idiosyncratic selection and imbalanced 
emphasis; partly, perhaps due to its failure—intentional or not—to discuss 
interactions between the various trends in modern philosophy. In other words, 
even though Losee may be saying nothing but the truth in reporting the views of 
individual philosophers, he does not give an adequate account of modern 
philosophy, even within the limits he set for himself, by often not saying the whole 
truth. Let me add, that I realise how difficult it is to outline the history of the 
philosophy of science in two hundred odd pages and that the selection in any 
historian’s work is likely to provoke controversies and, finally, that one must not 
think that there is just one correct way of writing a history of the philosophy of 
science, introductory or otherwise. Having said all this I still think that Losee’s 
book goes much too far beyond the reasonable doubts one may have about the 
historical significance of the contributions various modern authors have made to 
the philosophy of science and takes too great liberties in disregarding some and 
overemphasising others; this is particularly inappropriate in an introductory 
book. My remaining comments will be critical and will be mainly concerned with 
the last four chapters of the book. I shall begin, however, at the beginning and 
follow, more or less, the order in which topics are discussed in the book: 

(1) In the Introduction, Losee sketches four views on the subject-matter of 
the philosophy of science. Then, with acknowledgement to Gilbert Ryle, he dis- 
misses the view that ‘the philosophy of science is a discipline in which concepts 
and theories of the sciences are analyzed and clarified’ as pretentious and as con- 

.fusing the job of a philosopher with the job of a scientist——To my mind, neither 
of the charges is really serious; moreover, Losee’s own view on the subject- 
matter of the philosophy of science is clearly open to both: if it is desirable for a 
philosopher of science to be modest, what could be more pretentious than seeking
answers to questions such as ‘what procedures should scientists follow in investi-
gating nature?’ But this is one of the questions on the list of Losee’s preferred
problems for the philosophy of science; besides, Losee explicitly admits:
‘Recognition that the boundary-line between science and philosophy of science is
not sharp is reflected in the choice of subject-matter for this historical survey’
(p. 4).
(2) Quite rightly an attempt is made in the book to trace the development of 
the idea of the elimination of alternative hypotheses by falsification (‘the method 
of falsification’). Robert Grosseteste’s contribution to this development is dis- 
cussed in chapter 5 and Bacon’s in chapter 7. Then, we read: ‘Of course, Francis 
Bacon did not invent this method of falsification. Aristotle had employed it, and 
Grosseteste and Roger Bacon had recommended this method as a standard way 
to establish a hypothesis by eliminating competing hypotheses’ (p. 66).—It seems 
to me that a reference should have been made in the book to Ptolemy’s lucid and 
beautiful account of the method of elimination of alternatives which do not 
explain available data, given in Almagest I, 3: ‘That the Heaven Rotates as a 
Sphere’ (in Drabkin and Cohen’s Source Book in Greek Science, p. 123). Again, 
all these anticipations notwithstanding, a balanced survey of modern philosophy 
of science should link Popper’s name with the idea that scientific method consists 
chiefly in the method of falsification; but all that Losee has done was to discuss 
‘falsifiability’ as Popper’s criterion of demarcation. This, by the way, helps him 
overlook certain deficiencies in Duhem’s criticism of ‘crucial experiments’, to 
which I will return in a while. Finally, there is no reference in the book to the role 
in science of ‘weak’ falsification or elimination of alternative hypotheses with the 
help of statistical inference (the testing of statistical hypotheses); sadly, there is 
no mention of statistical inference in the book altogether and the terms ‘statistics’, 
‘probability’, ‘probabilistic (statistical) law’ do not appear in the index of subjects. 
(3) Following the now well established tradition, Losee attributes (pp. 170-2) 
to Duhem the priority of seeing and describing correctly ‘the logic of discon- 
firmation’, namely the fact that since an observational prediction is usually 
deducible from a conjunction of premisses, failure to observe the predicted 
phenomenon does not unambiguously falsify any single hypothesis but only the 
conjunction of the premisses. On the next two pages Losee reports Poincaré’s 
conventionalist view of the laws of mechanics and of geometry; part of Poincaré’s 
conventionalism is exactly the claim that no unambiguous falsification of either 
the laws of mechanics or of geometry is possible, since any crucial experiment 
designed to decide, e.g. between Lobatchewskian and Riemannian geometries 
would have to rely on some laws of physics (optics) and, therefore, in the case of 
a negative outcome of our experiment we could either give up one of the two 
competing geometries or modify the laws of optics (cf. pp. 72-3 and also pp. 
151-2 of the English Dover edition of Science and Hypothesis). Now this is 
exactly the same point about ‘the logic of disconfirmation’ as Duhem’s. While, 
naturally a discussion of the controversial issue of priorities (Poincaré’s Science 
and Hypothesis appeared in 1902, Duhem’s La Théorie Physique in 1906, but it is 
known that both were written before 1900) would have been out of place in an 
introductory book, I think that a comment on the relevance of the ‘logic of dis-
confirmation’ for the famous problem of the choice of a geometry would have
been desirable.
As regards Duhem’s criticism of the idea of crucial experiments and of the
claim that there are crucial experiments in science, Losee does not say (consis-
tently with his policy to report rather than assess) that the criticism—though
understandable and plausible against the background of certain exaggerated 
views expressed by both some scientists and some philosophers—is nevertheless 
defective on, at least, two counts: (a) crucial experiments can be, and in fact
often are, construed as being ‘decisive’ or ‘conclusive’ relative to a fixed list of alterna-
tive hypotheses; on this construal, they are valid deductive arguments and, there-
fore, ‘conclusive’ in the way all valid deductive arguments are: they ensure the
truth of the conclusion, provided that all premisses are true; now, one of the 
premisses is the list of alternative hypotheses, formulated as a disjunction of those 
hypotheses: naturally, if that disjunction is not true—and it is always hypothetical, 
except in the marginal case of ‘H or not-H’—there is no guarantee that the con- 
clusion will be true. Since, from the empiricist viewpoint at least, no empirical 
hypothesis can be proved anyway in any sense stronger than ‘confirmed’, 
‘corroborated’ etc., there is no reason why one should not be interested in hypo- 
thetical crucial experiments which certainly do exist (this was roughly the gist of 
Popper’s comments on Duhem’s criticism). Similar comments apply, mutatis 
mutandis, to the choice of statistical hypotheses on the basis of a suitable testing- 
procedure: the choice may be rational relative to given assumptions (or the testing 
‘model’ as some statisticians call it); (b) one may construe some outcomes of
experiments as crucial in the historical or pragmatic sense: together with other 
outcomes and considerations they have ‘tipped the scales’ i.e. persuaded the 
scientists of the day to reject one and to retain another of alternative hypotheses 
under consideration; again, there are crucial experiments in this pragmatic sense 
(this is similar to the suggestion made by Lakatos that crucial experiments are 
seen as such only ex post facto). An analysis along the above lines would have 
helped readers of this historical introduction to see Duhem’s criticism and the 
idea of crucial experiments (as well as ‘the method of falsification’) in a better 
perspective. This would be even more so, had Losee noticed and reported the 
first formulation of the ‘incommensurability thesis’ by LeRoy, its criticism by 
Poincaré (The Value of Science) and reformulation and defence by Ajdukiewicz 
in Erkenntnis ([1934]) and ([1935]). For it is this line of conventionalism that 
anticipated some of the more recent criticisms of the idea of ‘crucial experiments’ 
and logical or empirical comparability of scientific theories. There is no trace in 
the book of these controversies over the limitations of empiricist rationality. 
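
The two logical points at issue in (3) can be set out schematically. The notation is introduced here purely for illustration and is not taken from Losee, Duhem or Popper: H_1 and H_2 stand for rival hypotheses, A for the conjunction of auxiliary assumptions, and O for the predicted observation. The first schema is the ‘logic of disconfirmation’; the second is a crucial experiment construed as conclusive relative to a fixed list of alternatives.

```latex
% Logic of disconfirmation: only the conjunction is refuted.
\[
  (H_1 \land A) \rightarrow O, \quad \lnot O
  \;\vdash\; \lnot (H_1 \land A),
  \qquad\text{i.e.}\qquad \lnot H_1 \lor \lnot A .
\]

% Crucial experiment relative to the disjunctive premiss H_1 \lor H_2,
% with the auxiliary assumptions A taken as given.
\[
  H_1 \lor H_2, \quad A, \quad (H_1 \land A) \rightarrow O, \quad \lnot O
  \;\vdash\; H_2 .
\]
```

The second argument is deductively valid, but its conclusion is only as secure as the premisses H_1 ∨ H_2 and A, which is the sense in which such experiments are hypothetical.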

(4) Losee reports (pp. 172-4), I think correctly, Poincaré’s views on the pos- 
sibility of a double (or, strictly speaking, triple) interpretation of the laws of 
mechanics and of other scientific laws: they may be understood either as empirical 
laws or as conventions or else as performing both functions at the same time. 
Losee concludes rightly that ‘...it would be incorrect to attribute to Poincaré 
the view that general scientific laws are nothing but conventions which define 
fundamental scientific concepts’ (p. 173). However, at least three important 
points are missing from this account of Poincaré’s view of scientific theories: 
firstly, that Poincaré suggested splitting propositions (e.g. Newton’s laws) into
empirical and conventional components (cf. The Value of Science, p. 124), thus
anticipating vaguely an idea of Carnap’s; secondly, that Poincaré attached a
rather great importance to the distinction between empirical laws (inductive
generalisations) and abstract, theoretical laws, the latter of which he regarded as
being metaphorical and, therefore, neither true nor false; thirdly, that Poincaré
explicitly discussed how some laws undergo a change in their status, since 
scientists ‘... by an unconscious nominalism ... have elevated above the laws
what they call principles (conventions) . . .’. Had all these points been discussed 
adequately, then it would strike the reader as odd that (pp. 175-6) Losee should 
report as ‘an interesting recent discussion of the cognitive status of scientific laws’ 
Hanson’s ‘discovery’, stimulated by Wittgenstein’s philosophy, that the cognitive 
status of a law depends on how we use the law or on our attitude to the law in 
question. Is this not what Poincaré quite clearly said, what Popper re-affirmed in 
his Logic of Scientific Discovery (‘Only with reference to the method applied to a 
theoretical system is it at all possible to ask whether we are dealing with a con- 
ventionalist or an empirical theory,’ p. 82), and what Losee reports (p. 189) Ayer 
as saying? Since Losee disregards entirely all but three philosophers who began 
their careers after the nineteen-forties, one would expect him to report exclu- 
sively original, new developments, unless he thinks that there have been none. 
By the way (p. 190), the term ‘tautology’ is used in the old-fashioned way, to 
cover all sentences employed to introduce new terms, including ‘partial defini- 
tions’. 
(5) The relation between the history of science and the philosophy of science 
has been the subject of many disputes lately. In the book two philosophers are 
mentioned as those who attempted to base their philosophy on extensive studies 
of the history of science: Whewell and Duhem. On page 120 we read that accord- 
ing to Whewell ‘previous writers on the philosophy of science had regarded the 
history of science as a mere storehouse of examples... to illustrate particular 
points about scientific method . . .’. Whewell proposed to invert this relationship
which had made the history of science dependent on the philosophy of science. 
In the very next paragraph we learn that Whewell was quite sophisticated about 
the methodology of historical research knowing that it necessarily involved ‘acts 
of synthesis’. Accordingly, he selected certain interpretative categories to guide 
his historical studies, he ‘... saw scientific progress as a successful union of
facts and ideas’.—Now, it seems to me that what Whewell did was simply to 
comment on the history of science from the point of view of Kantian philosophy. 
Far from inverting the relationship between the history and the philosophy of 
science, he followed the most traditional path. If some of his ideas appeal more to 
some contemporary philosophers than the views of, for example, Mill, this is not 
because they are somehow based on careful and extensive historical research but 
simply because there has been a revival of Kantian philosophy in the last ten 
years or so. Again, Losee’s report of Duhem’s views (necessarily sketchy but 
correct, I think) does not convince me at all that Duhem’s philosophy of science 
was any better on account of his important contributions to the history of science 
than, for example, the philosophy of his contemporary, Poincaré, who had no 
time for historical research; needless to say, Poincaré knew—if I am permitted 
this understatement—modern mathematics and physics. Naturally, I would not 
want to make inductive generalisations from these two cases one way or another. 
(6) According to the Preface, the emphasis of the book is on the developments 
prior to nineteen forty. Yet the reader will find no trace of the names, let alone of 
the views, of such classics of the philosophy of science as Comte, Peirce, Russell, 
Keynes or Ramsey, not to mention many other leading philosophers of the Anglo-
Saxon world or more exotic Continental Europeans. The reader will not have
from the book any idea how influential in nineteenth-century European phil-
osophy was ‘the first positivism’ of Comte which attempted to reduce science to
collecting and systematising ‘objective facts’ and philosophy to classifying
sciences. He will not know that the conventionalism of Poincaré and Duhem 
(preceded and strengthened by pioneering philosophical insights of Ampère, 
Cournot and Bernard) was partly a reaction against Comte’s positivism and partly 
against the Kantian interpretation of mathematics and of physics in terms of 
‘synthetic a priori truths’, although conventionalism, in turn, absorbed certain 
ideas from positivism and from Kantian philosophy. Since there is no mention 
in the book how influential in nineteenth-century philosophy of mathematics and 
of science was Kant’s philosophy, the reader will not understand either the 
role the discovery of non-Euclidean geometries played—rightly or wrongly
—in undermining the confidence in Kantian philosophy, or the attempts to 
reconcile some elements of Kantianism with newer trends such as positivism and 
conventionalism (cf. Poincaré’s belief in the synthetic a priori nature of the 
principle of mathematical induction and his rejection of the Kantian interpreta- 
tion of physics and geometry; also Vaihinger’s ‘fictionalism’). Again, the reader 
will not learn from Losee’s book that the dominant philosophy of science in the 
interwar period, i.e. prior to 1940, was Logical Empiricism, known also as ‘the
third positivism’ (the second being that of Mach, Avenarius etc.), which attempted
to reduce philosophy to philosophy of science, and the latter to the logic of science,
based on the ‘new logic’ of Bertrand Russell, on Carnap’s logical syntax, Tarski’s
logical semantics, and on probability theory; none of these terms, essential to 
understand the philosophy of science of Logical Empiricism, occurs in the book. 
The reader will learn more about the philosophy of science of Logical Empiricism 
from Passmore’s book, in spite of its sarcastic tone. 

Carnap’s, Reichenbach’s and Popper’s contributions to the philosophy of 
science have been disposed of in the chapter on the ‘Twentieth Century Views on 
the Demarcation of Science’. There is no mention at all of such influential con- 
tributions as Carnap’s theory of logical probability (confirmation), Popper’s 
criticism of ‘induction’ and of ‘inductivism’, Reichenbach’s philosophy of space 
and time based on his analysis of Relativity and Quantum Physics, Nagel’s 
account of the reduction of theories! Frank is mentioned in connection with his 
views on the role of simplicity of ‘operational definitions’ in making theories 
empirical (anyway, largely anticipated by Poincaré), but there is no account of his 
interesting study (in Foundations of Physics) of the changes in scientific theories 
and of inter-theory relations, so relevant to more recent disputes in the philosophy 
of science. From the curious design of the last four chapters of the book one may 
have the impression that the main—if not the only—contributions to the analysis 
of the structure of scientific theories have been made by Duhem, Campbell, 
Hempel-Oppenheim, Hesse and Harré. It is difficult to see why chapters 10 and 
11 have been separated from chapter 9, since chapters 10 and 11 also deal with
the structure of theories and with the implications of classical physics for the 
philosophy of science. The opening paragraph of ‘The Verifiability Criterion’ in
chapter 12 (p. 184) is misleading. Losee contrasts there Bridgman’s concern as ‘a
practising physicist interested in methodological problems within his discipline’
with the Vienna Circle ‘a group of philosophers under the leadership of Moritz
Schlick’ whose concern was ‘to purge philosophy of all metaphysics’. Now,
Bridgman’s operationalism did not arise from his investigations as a practising
physicist of the properties of matter under high pressures: it resulted from his 
work on his lectures on electrodynamics and relativity. Mach’s and Einstein’s 
criticisms of the Newtonian ‘absolute’ concepts were also the inspiration for
Schlick, Frank, Carnap and Reichenbach in their attempts at defining the 
criteria of empirical meaningfulness and for Popper in his attempt at finding a 
satisfactory criterion of demarcation. It may also be worth noting that Heyting’s 
Intuitionist Foundations of Mathematics appeared in 1931 in Erkenntnis and 
that the intuitionist criterion of meaningfulness was designed to purge mathe- 
matics, not philosophy, of meaningless concepts and inadmissible operations. 
If one of the aims of a satisfactory historical introduction to the philosophy of 
science is to help readers understand what is going on in contemporary philosophy 
of science as a result of previous developments, then Losee’s book has serious 
shortcomings in this respect. It will not help readers understand, for example, 
why so many recent contributors to the philosophy of science bothered to 
criticise ‘the logical approach’, ‘the view of scientific theories as interpreted 
systems’, ‘the empiricist view of science’, ‘the positivist philosophy’, ‘probabilistic 
inductivism’, etc. To be sure, as long as we rely exclusively on Losee’s modest 
Introduction we are not going to be aware of the existence of such critics and 
criticisms. Of all contemporary philosophers of science, not known before 1940, 
the names of only three (Hanson, Hesse, Harré) occur in the book and some of 
their views are discussed. One of those three (Harré) is mentioned as ‘a vigorous 
critic of deductivist and positivist philosophies of science’ (p. 132), but even this 
phrase itself cannot be comprehensible to readers on the basis of the book 
(does ‘deductivist’ refer to the views discussed in chapter three, concerned with
the Ancients, or in chapter 10, concerned with Hempel and Oppenheim, Nagel
and Frank, or to both, or to neither?; ‘positivist philosophy’ does not appear in 
the book, although the ‘mathematical positivism’ of Berkeley does). Since the 
book is small and its emphasis is on the developments prior to 1940, we must not 
expect too much so far as recent philosophy is concerned. Still, at least a short 
postscript with an outline of later developments would have been helpful. 


JERZY GIEDYMIN 
University of Sussex 


Naess, A. [1972]: The Pluralist and Possibilist Aspect of the Scientific Enterprise. 
London: George Allen and Unwin. £4.00. Pp. 148. 


The most important pluralistic aspect of the scientific enterprise, according to 
Professor Naess, is that described by what he calls ‘the Mach-Duhem-Poincaré 
principle’: ‘Given a set of observation sentences... there are indefinitely many 
. . . mutually incompatible, sets of theoretical sentences such that the given set of 
observation sentences can be derived from them’ (p. 43). Much of the book con- 
sists of assorted applications or implications of this principle, most of which I 
found unenlightening. The author alludes to (but does not analyse) the many- 
one relationship which exists between the precise formulations of a theory and 
the central theoretical idea lying behind them; e.g. there are an ‘indefinite
plurality of versions of Newton’s dynamics’ (p. 67), each with a different cogni-
tive content (p. 66). (He refers to the presentations of d’Alembert, Lagrange,
Hamilton, and Jacobi.) Naess also follows Hanson in claiming that each precise
formulation of a theory, such as ‘F = ma’, has a multiplicity of functions and
uses (p. 83).

But the reader will wonder why it is important to recognize these various 
pluralities. Suppose it is true that there are an indefinite number of versions of 
Newton’s laws. How does this claim increase our understanding of the history of 
mechanics? How does it help us to understand what is inadequate with other 
philosophical accounts of science? Perhaps the explanatory power of the pluralist 
view is obvious, but I, for one, need to have the consequences spelled out. 

One suspects that Naess is not really interested in pluralism in science at all, 
except for its usefulness as a model for philosophy of science and linguistics. 
The real purpose of the discussion of Newtonian physics is to arrive at the follow- 
ing conclusion (p. 83): 


If “F = ma” has at least five main uses, a sentence such as ‘“F = ma” has at
least five main uses’ should not be expected to have less. . . . The valuable
insistence on plurality of uses must not be limited to physical research. It is
as pertinent in lexicographical and empirical semantical research.


Naess devotes much time and effort to a survey of the various things which 
scientists and philosophers may mean when they use meta-scientific terms, such 
as ‘theory’ (Chapter III) and ‘refutation’ (Chapter II). The reader will judge for
himself the value of such lexical catalogues. 

Naess does make one application of the ‘Mach-Duhem-Poincaré principle’ to 
philosophy of science which is rather interesting, viz., his criticism of Feyera-
bend. Although Feyerabend recognizes the importance of theoretical alternatives
in science, Naess claims that he seems to think that one can settle on a unique 
philosophy of science by paying closer attention to the history of science. As 
Naess puts it (p. 133): 


. .. Feyerabend seems to see himself as the old Nordic god Thor, using ‘the 
hammer of history’ to smash false images of science in thunder and lightn- 
ing. His source of authority: ‘actual scientific practice’. 


Given these various pluralistic aspects of science, language, and philosophy, 
how is one to proceed? Naess thinks that the main job of the philosopher is to 
clarify and catalogue the variants of theory, usage, and world-view known so far, 
although, as we shall see at the end, there are even serious limitations on this 
activity. Some readers may remember that Naess, in 1938, conducted a world- 
wide survey on whether people agree or disagree with Tarski’s correspondence 
theory of Truth. But there is another, more dramatic moral to be drawn—which 
brings us to the possibilist aspect of science, as Naess sees it. 

Since no theory is ever refuted (because there are a plurality of ways in which 
we can formulate the theory, interpret experimental results, conceive of falsi- 
fication, theorise about observation, etc.), then nothing is ever ruled out and 
anything is possible—perpetual motion machines, things travelling faster than 
light, a horse winning the Derby by running backwards (p. 79). Naess thinks 
that in some contexts (such as an everyday setting) one would be wrong to think
it possible that a horse will win the Derby running backwards. But there are a
plurality of contexts and in a biological discussion such a strange feat of loco-
motion may well be “declared” possible (p. 79).

1 Naess [1938].

The general point is not new and Naess cites Quine and Lakatos as holding 
similar views. However, the author does not present any new arguments for his 
position, nor does he try to argue that his version is better than other fallibilist 
or pluralist doctrines. (He does point out that his slogan, ‘Anything is possible’ 
does not have the hedonistic, permissive overtones of Feyerabend’s ‘Anything 
goes’ (p. 49).) Neither does he tackle the difficult and crucial question of how we 
might appraise alternative theories, each of which is possible, but some of which 
presumably are better than others. 

As an attempt to provide a detailed, coherent account of a philosophical 
position, this book is a failure. Nevertheless, one might find it of heuristic value. 
There are many isolated paragraphs which contain the seeds of ideas which might 
turn out to be interesting. For example, Naess suggests that the historian should 
not just echo or report past happenings and opinions but should also ‘take a 
responsible stand within the scientific debates of the past’ (p. 13). (He is parti- 
cularly critical of the new breed of oral historians who passively record the utter- 
ances of physicists telling their own story.) 

Here are other potentially interesting positions which Naess hints at, but does 
not develop: 


1. All sciences are testable fragments of fairly well-worked out world-views 
(p. 11). (Examples, please! The articulation of which world view leads to 
meteorology?) 

2. Experiments are usually viewed as being more conclusive or more crucial 
at the time than they are later (p. 18). (Lakatos claims the reverse, i.e., that it is 
only in retrospect that an experiment is labelled as a refutation. Naess may be 
right, but he does not discuss Lakatos’s view; neither does he provide any 
historical examples of his own view.) 

3. The life-expectancy of a theory depends largely on the rate of proliferation 
of promising alternatives (p. 21). (Does it also depend on how well the theory 
explains the available data? Does the rate of proliferation depend at all on the 
success of the original theory? One wishes Naess would have replied to Kuhn 
and Lakatos on this point.) 

4. A measure of the ad-hocness of a Duhemian strategy (viz. blame the auxil- 
iaries not the central theory) is the relative strength of the disjunction of the auxil- 
iaries (p. 24). (Relative strength is defined in terms of the number of variables in 
the formula and the number of ‘falses’ in the truth table (p. 54). The definition 
works, at least for simple cases.) 

5. Given its monolithic, justificationist image, science is more of a threat to 
free society than the plurality of pseudo-sciences. ‘Small scale irrationality is of
little importance compared to the global irrationality created by dominant 
“scientific” world views fostered by “genuine” science’ (p. 40). (A bold claim— 
one would like to hear Naess’s arguments for it.) 

6. Scientists judge theories not just in terms of their truth or falsity but 
according to at least ten criteria. (Naess’ own list isn’t very interesting—one of 
them is ‘. . . the extent to which we can assume cooperational efficiency to have
been due to other factors than positive transference among members of an in-
group’ (p. 36)—but he is probably right in saying that the scientific enterprise
cannot be understood simply as the search for truth, or knowledge, or even
explanation.) 

7. Scientific theories should not be construed as ‘mirrors of reality’ but as 
‘ideal-typical constructs’ (p. 62). ‘The relation between system and reality is not 
one of mirroring or copying, not even one of structural isomorphism. Therefore, 
two mutually inconsistent systems may both correspond to reality’ (p. 131). (How 
are we to make sense of this idea?) 

There are also intriguing passing references to German hermeneutic philoso- 
phers, such as Adorno, Habermas and Horkheimer, but Naess does not discuss 
their views in any detail. 

Whatever the shortcomings of his work one must admire the author for his 
existential authenticity. In the last three pages of the book, he clearly outlines the 
gist of his pluralist-possibilist philosophy (pp. 131-3): 


1. All consistent comprehensive points of view have a non-zero status of 
validity... . 

2. Reality is one... 

3. Two mutually inconsistent systems may both correspond to reality... 

4. A comprehensive and consistent point of view is not something a 
person... can have [in the sense of choose; rather such views are like a part 
of his personality]... 

5. Points 1 to 4 are not capable of being made precise beyond certain 
(ill-defined, modest) limits of definiteness. 


Many philosophers have been led to the conclusion that philosophy is futile, 
but few have taken their own arguments seriously enough to act on them. On the 
book jacket we read that Arne Naess has resigned from his Chair of Philosophy 
at Oslo in order to devote himself more fully to the urgent environmental prob- 
lems facing man. 


NORETTA KOERTGE 
Indiana University

Probability and Quantum Mechanics

by C. F. VON WEIZSÄCKER

1 Introductory Remarks.
(a) Probability and Experience.
(b) The Probability Axioms of Quantum Mechanics.
(c) The Universality of Quantum Theory.

2 The Classical Concept of Probability.

3 Probability in Quantum Theory.

1 INTRODUCTORY REMARKS

This paper points out a connection between the basic assumptions of the
theory of probability and those of quantum theory. In order to do this I
shall discuss three problems.


(a) Probability and Experience 

The theory of probability had its origin in an empirical question:
Chevalier de Méré’s gambling problem. Equally, the present-day physicist
finds no difficulty in empirically testing probabilities which have been
theoretically predicted, by measuring the relative frequencies of the
occurrence of certain events. On the other hand, the epistemological
discussion on the meaning of the application of the so-called mathematical 
concept of probability to empirical reality is by no means settled. The battle 
is still raging between ‘objectivist’, ‘subjectivist’, and even other interpre- 
tations of probability. Probability is one of the outstanding examples of the 
‘epistemological paradox’ that we can successfully use our basic concepts 
without actually understanding them. In many apparent paradoxes which 
are connected with fundamental philosophical problems the first step 
towards their solution consists in accepting the seemingly paradoxical 
situation as a phenomenon, and in this sense as a fact. Thus we will have 
to understand that it is the very nature of basic concepts to be practically 
useful without, or at least before, being analytically clarified. This clarifica-
tion will have to use other concepts in an unanalysed manner. It may mean
a step forward in such an analysis to see whether a hierarchy exists in the
practical use of basic concepts, and which concepts then practically depend
on the availability of which other concepts, and also to see where they
interlink in a non-hierarchical manner. In Section 2 I shall try to show
that one of the traditional difficulties in the empirical interpretation of
probability stems from the idea that experience can be treated as a given
concept and probability as a concept to be applied to experience. This is
what I call a mistaken epistemological hierarchy. I shall try to point out
that, on the contrary, experience and probability interlink in a manner 
which will preclude us from understanding experience without already 
using some concept of probability. I shall offer a particular way of intro- 
ducing probability in several steps. 


(b) The Probability Axioms of Quantum Theory 

While the aforementioned paradox is much discussed, another paradox 
which is more specialised but no less striking seems to have gone nearly 
unnoticed. The mathematical theory of probability can be presented in an 
axiomatic form; Kolmogorov’s axioms are still a classical formulation. 
The empirical use of the concept of probability then seems to mean that 
we give an empirical meaning to these axioms. The most universal theory 
of physics today is quantum theory. There is no phenomenon, at least in 
inorganic nature, which the present-day physicist would not consider to 
be subject to the laws of quantum theory. Thus one would expect that in 
the present situation of natural science the most fundamental test of any 
theory on the empirical meaning of probability would be its application to 
the probability concept of quantum theory. Yet I am not aware of any 
serious attempt to do this. Whoever tries it will in fact encounter a most 
fundamental obstacle: it is doubtful whether Kolmogorov’s axioms hold 
in quantum theory. The basic feature of quantum-mechanical probability 
is the ‘interference of probabilities’; the fundamental laws do not directly 
refer to the measurable probabilities but to the ‘probability amplitudes’.
In the Kolmogorov system (see Section 3 of this paper) this means that the
first axiom, which states that the possible events form a Boolean lattice, is
to be replaced¹ by another axiom stating that the lattice of events is given
by the subspaces of a Hilbert space. If we could consider experience as a 
given concept and probability as a mathematical concept to be ‘applied’ 
to the given experience, this change in the axioms might seem to be an 
innocuous formal readjustment. If, however, our understanding of the 
meaning of experience depends on our use of the concept of probability, 
the case is different. Thus I would like to condense the present situation in 
the epistemology of probability into the ‘paradox of quantum probabilities’: 
If an axiomatic system of probability is to be applied to experience in
accordance with the laws of nature as known today, this cannot be the
classical theory of probability which has been nearly exclusively studied
both by mathematicians and by epistemologists.

1 Whether this replacement is necessary or not depends on the phrasing; cf. Section 3.
The change as compared to classical theory lies in the distinction between two meanings
made by a difference in phrasing which would be irrelevant in classical theory.
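
The ‘interference of probabilities’ referred to above can be illustrated by a small numerical sketch. It is not part of the argument of the paper. It assumes two alternatives whose amplitudes both have modulus 0.5 and differ by an adjustable phase; these values and all function names are chosen only for illustration. In the classical calculus the probabilities of exclusive alternatives add; in the quantum rule the amplitudes add and the squared modulus is taken afterwards.

```python
import cmath
import math


def classical_sum(p1, p2):
    """Classical additivity: probabilities of exclusive alternatives add."""
    return p1 + p2


def interfering_sum(a1, a2):
    """Quantum rule: amplitudes add, and the squared modulus is taken last."""
    return abs(a1 + a2) ** 2


def two_alternatives(phase):
    """Return (classical, quantum) values for two amplitudes of modulus 0.5.

    The modulus 0.5 and the relative phase are illustrative assumptions.
    """
    a1 = complex(0.5, 0.0)
    a2 = 0.5 * cmath.exp(1j * phase)
    classical = classical_sum(abs(a1) ** 2, abs(a2) ** 2)
    quantum = interfering_sum(a1, a2)
    return classical, quantum


if __name__ == "__main__":
    for phase in (0.0, math.pi / 2, math.pi):
        classical, quantum = two_alternatives(phase)
        print(f"phase={phase:.3f}  classical={classical:.3f}  quantum={quantum:.3f}")
```

The classical value does not depend on the phase, while the quantum value varies between zero and twice the classical value; the two agree only when the amplitudes are a quarter-turn out of phase.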

The difference between classical and quantum-theoretical probabilities 
precisely corresponds to the difference between classical and quantum 
physics. Most epistemologists, especially those of the school of logical
positivism, have considered this difference in physics as being of a 
‘merely empirical character’, i.e. as lying on a lower level in the scientific 
hierarchy than that on which decisions concerning the meaning of con- 
cepts like experience and probability are made. This is probably the 
explanation of their lack of understanding of the first-rate importance of 
quantum theory for the formulation of the very meaning of those ‘higher’ 
concepts. 

Bohr’s concept of complementarity was never understood because it 
was misinterpreted as a generalisation of a particular empirical concept 
of physics, while Bohr intended it to indicate a universal structure of
all human experience which could be particularly well exemplified only 
in quantum theory. Von Neumann indicated the universality of the 
problem by describing the non-Boolean lattice of quantum events in terms 
of a new logic, so-called quantum logic. He did not, however, resolve, 
perhaps not even appreciate, the ‘paradox of quantum logic’, that a 
particular experience here is said to lead towards a new logic, while all 
experience is produced by already using a logic which seems to be available 
and hence probably to be structurally fixed in advance. Now we will see 
in Section 2 that in the process of establishing experience the concept of
probability appears as a concept of a ‘logic of temporal propositions’. 
Prognostic statements on quantum events are propositions about the future 
with a double aspect, one of which subjects them to the classical logic of 
propositions while the other leads to quantum logic. Historically the 
discovery of this second logical possibility was induced by the experience
of atomic physics, but after having been discovered it can be understood
without reference to that experience. 


(c) The Universality of Quantum Theory

Problem (b) assumes its striking nature if we presuppose—as I actually
did in expounding it—that quantum theory has a fairly universal validity.
If we make such a presupposition at all we should not accept it as a mere
historical fact, but we ought to transform it into a problem in itself. Is there
a way of understanding the sweeping success of this particular theory?
Of course, all living physicists would hesitate to think of quantum theory
as formulating the final universal laws on the da of physical objects
and on their change with time. They would hesitate, partly because they
would hesitate to think that any set of laws could be final, partly because 
they may still consider the interpretation of quantum theory as unsatis- 
factory and obscure. But their actual acceptance of the empirical validity 
of the theory is not gravely embarrassed by these qualms. If we at all 
expect a new theory some day to replace quantum theory in a way similar 
to the replacing of classical mechanics by quantum mechanics, we think 
of such a new theory as being even more universally valid, as referring to
experiences not yet known or not yet understood today, and as containing
quantum theory as a limiting case for the vast field of its applicability that 
is known today. The degree of universality allotted to quantum theory in 
this view is all we need for stating the problem. 

The problem will become clearer by stating it as an example of a more 
general apparent epistemological ‘paradox of universal theories’: Given the 
immense variability of possible experience, what is the likelihood for 
the validity of a very simple set of laws that will enable us to predict such 
experience for the future out of present experience? Stated in this way, 
the ‘paradox’ contains two different problems: (1) that there is any necessary
connection between present and future (the problem of causality), (2) that
the laws stating this connection are as simple as the well-known basic
theories of physics, most of which today are embraced within quantum 
theory. I am not going to discuss these fundamental philosophical problems 
in this paper. I think that their solution can only lie in recognising the 
structures expressed by the theories as preconditions of the very possibility 
of experience. I shall only try to show in which particular sense this 
statement may apply in quantum theory. I think quantum theory is as 
universally valid as it in fact is because it formulates nothing but general 
laws of probability, including laws for the change of probabilities with time. 
If this view could be justified, it would close the circle of our considerations. 
We would have been right in using quantum theory as an argument in 
the interpretation of probability because we would thereby have used not 
an accessory trait of quantum theory but its very essence. 

In Section 3 I shall discuss these aspects of Quantum Theory. That 
section contains an interpretation of the formalism of second quantisation 
(more generally speaking, of multiple quantisation) as strictly correspond- 
ing to the steps by which classical probability is introduced in Section 2. 
It further describes the dynamical relation between these steps of quantisa- 
tion by means of the well-known quantum-theoretical interpretation of the 
classical principle of action by Dirac and Feynman.

1 I have described some aspects of this proposal in my [1970].

2 THE CLASSICAL CONCEPT OF PROBABILITY


This section does not contain a strict theory of classical probability but an
outline of an analysis of its probability concept which emphasises those
traits of the theory in which the epistemological difficulties usually arise. 
I hope that this analysis would suffice for establishing a consistent theory 
of classical probability in which we might follow any good textbook for the 
mathematical details. The term ‘classical’ here only means ‘not quantum-
theoretical’.

We shall interpret the concept of probability in a strictly empirical sense. 
We consider probability to be a measurable quantity whose value can be 
empirically tested no less than, e.g. the value of an energy or a temperature. 
What we need for defining a probability is an experimental situation in
which different ‘events’ $E_1, E_2, \ldots$ are the possible results of one experi-
ment. We further need the possibility of meaningfully saying that the same
experimental situation prevails in different cases (in different ‘realisations’,
‘at different times’, ‘for different individual objects’ etc.) and that, given
this situation, the same experiment is carried out in each case. Let there be
$N$ performances of the experiment, and assume that event $E_k$ has occurred
$n_k$ times. In this series of cases we will call the fraction

$$f_k = \frac{n_k}{N}$$

the relative frequency with which $E_k$ has occurred in the series.

Now consider a future series of performances of the same experiment.
Let us assume that our (theoretical and empirical) knowledge enables us
to calculate a probability $p_k$ of the event $E_k$ in this experiment. Then we
will take the meaning of this number $p_k$ to be that it is a prediction of the
relative frequency $f_k$ for the future series of performances. $p_k$ will be
empirically tested by comparing it with the values of $f_k$ that will be found
in this and further series of the experiment under consideration.

This is the simplistic view of the ordinary experimentalist. I think it is 
essentially correct and it will only need to be defended against the objections 
of the epistemologists. Of course we hope to understand it better by defend- 
ing it. 

Let us use a simple example for formulating the main objection. Our 
experiment will consist in the single cast of a die. There are 6 possible 
events. Let us choose the event that a ‘five’ will appear as the one in which 
we are interested. Its probability $p_5$ will equal 1/6 if the die is ‘good’.
Now let us cast the die $N$ times. Even if $N$ is divisible by 6, the fraction $f_5$
will only rarely be exactly 1/6, and, what is more relevant, the theory of
probability does not expect $f_5$ to be 1/6. The theory predicts a distribution
of the measured values of $f_5$ in different series of casts around the theoretical
probability $p_5$. The probability is only the expectation value of the relative
frequency. But the concept ‘expectation value’ is generally defined by 
making use of the concept ‘probability’. Hence it seems impossible to 
define probability by referring it to measurable relative frequencies, since 
this definition, if strictly formulated, will have to contain the very concept 
of probability; it will—so it seems—be a circular definition. 
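
The objection can be followed numerically in a minimal sketch, which assumes (beyond the text) that the casts are independent, so that the number of fives in $N$ casts follows the binomial law; the function names are hypothetical. The sketch computes how probable it is that the relative frequency of fives is exactly 1/6, together with the expectation value and the spread of that frequency. That the computation already uses probabilities is just the circularity at issue.

```python
from math import comb, sqrt


def count_distribution(n_casts, p):
    """Binomial probabilities P(n) for n = 0..n_casts successes (assumes independent casts)."""
    return [comb(n_casts, n) * p**n * (1 - p) ** (n_casts - n)
            for n in range(n_casts + 1)]


def expected_frequency(n_casts, p):
    """Expectation value of the relative frequency n / n_casts."""
    dist = count_distribution(n_casts, p)
    return sum((n / n_casts) * prob for n, prob in enumerate(dist))


def frequency_spread(n_casts, p):
    """Standard deviation of the relative frequency around its expectation value."""
    dist = count_distribution(n_casts, p)
    mean = expected_frequency(n_casts, p)
    return sqrt(sum(((n / n_casts) - mean) ** 2 * prob for n, prob in enumerate(dist)))


def prob_frequency_exactly_p(n_casts):
    """Probability that exactly one sixth of the casts show a five (n_casts divisible by 6)."""
    return count_distribution(n_casts, 1 / 6)[n_casts // 6]


if __name__ == "__main__":
    for n_casts in (6, 60, 600):
        print(n_casts,
              round(prob_frequency_exactly_p(n_casts), 4),
              round(expected_frequency(n_casts, 1 / 6), 4),
              round(frequency_spread(n_casts, 1 / 6), 4))
```

Under these assumptions the expectation value of the relative frequency stays at 1/6 for every series length, while the chance of finding exactly 1/6 falls and the spread narrows as the series grows longer.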

We will not evade the problem by defining the probability as the limiting 
value of the relative frequency for long series, since there is no strict 
meaning to a limiting value in an empirical series which is essentially finite. 
These difficulties have induced some authors to abandon the ‘objectivist’ 
interpretation altogether in favour of a ‘subjectivist’ one which, e.g. reads 
the equation $p_5 = 1/6$ as meaning: ‘I am ready to bet 1 against 5 that there
will be a five next time’. The theory of probability then is a theory of the 
consistency of a betting system. But this is not the problem of the physicist. 
He wishes to discover empirically who will become a rich man by his betting 
system. I am not going to enter into the discussion on these proposals. 
I shall rather immediately offer my own proposal.

The origin of the difficulty does not lie in the particular concept of 
probability but more generally in the idea of an empirical test of any 
theoretical prediction. Consider the measurement of a position coordinate 
$x$ of a planet for a certain moment of time; let its value predicted by the
theory be $\xi$. A single measurement will give a value $\xi_1$, different from $\xi$.
The single measurement may not suffice to convince us whether this result
is to be considered as a confirmation or a refutation of the prediction.
Then we will repeat the measurement $N$ times and apply the theory of
errors. Let $\bar{\xi}$ be the average of the measured values. Then, comparing the
distance $|\bar{\xi} - \xi|$ with the average scattering of the measured values, we can
formally calculate a ‘probability’ with which the predicted value will
differ from the ‘real’ value by a quantity $d = |\bar{\xi} - \xi|$. This ‘probability’ is
itself a prediction of the relative frequency with which the measured
distance $|\bar{\xi} - \xi|$ will assume the value $d$, if we repeat the series of measure-
ments many times. This structure of the empirical test of a theoretical
prediction is slightly complicated, but well known. We can compress it 
into the abbreviated statement: ‘An empirical confirmation or refutation 
of any theoretical prediction is never possible with certainty but only with 
a higher or lesser degree of probability.’ This is a fundamental feature of
all experience. In the present paper I am satisfied to describe it and to
accept it; its philosophical relevance is to be discussed in another context.
Whoever is working in an empirical science has already tacitly accepted it
by his practice. In this sense the concept of scientific experience in practical
use presupposes the applicability of some concept of probability, even if
this concept is not explicitly articulated. Hence the very attempt of giving
a complete definition of probability by having recourse to a given concept of
experience is probably bound to lead into a definitional circle. Of course it 
would be equally impossible to define the concept of an empirical test by 
using a preconceived concept of probability. The two concepts, experience 
and probability, are not in a relation of hierarchical subordination. 
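
The procedure of the theory of errors described above may be sketched as follows. The sketch is only an illustration under stated assumptions: the measured values are invented, the errors are taken to be Gaussian (which the text does not say), and the function name is hypothetical. It compares the distance between the average and the predicted value with the scattering of the measurements, and turns the result into the formal ‘probability’ of a deviation at least that large.

```python
from math import erfc, sqrt
from statistics import mean, stdev


def compare_with_prediction(measurements, predicted):
    """Compare the average of repeated measurements with a predicted value.

    Returns (average, distance, distance in standard errors, two-sided tail
    probability). The tail probability assumes Gaussian errors, an assumption
    made for this sketch only.
    """
    n = len(measurements)
    average = mean(measurements)
    scatter = stdev(measurements)           # sample standard deviation
    standard_error = scatter / sqrt(n)      # scatter of the average itself
    distance = abs(average - predicted)
    z = distance / standard_error
    tail_probability = erfc(z / sqrt(2))    # P(|deviation| >= distance)
    return average, distance, z, tail_probability


if __name__ == "__main__":
    # Invented position measurements, in arbitrary units.
    values = [10.1, 9.9, 10.2, 9.8, 10.0]
    for predicted in (10.0, 10.3):
        print(predicted, compare_with_prediction(values, predicted))
```

The verdict on the prediction thus appears only as a probability, and that probability is in turn a statement about relative frequencies in repeated series of measurements.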

In practice every application of the theory of errors implies that we 
consider relative frequencies of events to be predictable quantities. In this 
sense probability is a measured quantity. This implies that our ‘abbreviated 
statement’ also applies to probability itself: The empirical test of a theo- 
retical probability is only possible with some degree of probability. The 
appearance of the probabilistic concept of an expectation value in the 
‘definition’ of probability is therefore not a paradox but a necessary 
consequence of the empirical nature of the concept of probability; or it is 
a ‘paradox’ inherent in the concept of experience itself. Still, probability 
is not on the same methodological level as all other empirical concepts. 
The precise measurement of any other quantity refers us to the measure- 
ment of relative frequencies, that is, to probabilities; the precise measure- 
ment of probability refers us to probabilities again. Due to this higher 
level of abstraction the predictions of the theory are better defined. The 
scattering of the measured values of any quantity around its average value 
depends on the nature of the measuring device; the scattering of relative 
frequencies around their expectation value is itself defined by the theory.

Yet we have not so far achieved a definition of probability that would 
avoid the objection of being circular. I shall now sketch a systematic 
theory of probability as an empirical concept (i.e. a concept of a quantity
which can be empirically measured). This is done in three stages. We first
formulate a preliminary concept of probability. It does not aim at precision
but at being an understandable English description of the way in which
probabilistic concepts are actually used in practice. Secondly we formulate 
a system of axioms of the mathematical theory of probability. In this section 
we can take over Kolmogorov’s system. Thirdly we give an empirical 
meaning, so to speak a physical semantics, to the concepts of the mathematical
theory by identifying some of its concepts with some concepts connected 
with the preliminary concept of probability. This triple procedure can 
also be described as a process of giving mathematical precision to the 
preliminary concept. The most important part in the third stage is a 
study of the consistency of the whole procedure. The interpreted theory 
of the third stage offers a mathematical model of those structures which
were imprecisely described in the preliminary concept. I propose to call a
theory semantically consistent if it permits one to use the preliminary
concepts without which it would not have been given a meaning in such
a manner that this use is correctly described by the mathematical model
offered in the theory itself.

The preliminary concept is described by three postulates: 

A. A probability is a predicate of a formally possible future event or, more 
precisely, a modality of the proposition which asserts that this event 
will happen. 

B. If an event, or the corresponding proposition, has a probability very 
near to 1 or 0, it can be treated as practically necessary or practically 
impossible. A proposition (event) with a probability not very near to 
0 is called possible. 

C. If we assign a probability p (0 < p < 1) to a proposition or to the 
corresponding event, we thereby express the following expectation: 
out of a large number N of cases in which this probability is correctly 
assigned to this proposition there will be approximately n = pN cases 
in which the proposition will turn out to be true. 

The language in which we have formulated these postulates needs further 
explanation. We first see that restrictive concepts like ‘practically’, 
‘approximately’, ‘expressing an expectation’ are used. Their task is to 
indicate that our preliminary concept is not precise but should be made 
more precise. We will see that in this process these restrictive concepts 
will not be eliminated but be made more precise themselves. The word 
‘correctly’ in C indicates that we consider the ascribing of a probability to 
an event not as an act of free subjective choice but as a scientific assertion 
subject to test. 

The language of the postulates refers to a discipline which cannot be 
explicitly described in this paper: the logic of temporal propositions. 
Temporal propositions are statements which are not of a ‘timeless’ nature 
like the propositions of mathematics but which express the happening of 
an event in time. The central point in this logic is the distinction between 
propositions on the present, on the past, and on the future (‘it is raining 
(now)’, ‘yesterday it was raining’, ‘it will rain tomorrow’). For propositions 
about the future this logic proposes not to use the traditional truth values 
‘true’ and ‘false’, but the ‘futuric modalities’: ‘possible’, ‘necessary’, 
‘impossible’. These futuric modalities are to be distinguished from other 
uses of the same words. In postulate A such a distinction was made by 
calling an event ‘formally possible’. To use an example: If we say that it is 
formally possible that it will rain tomorrow we mean: ‘it will rain tomorrow’ 
is a meaningful proposition on the future; hence futuric modalities can be 
ascribed to it. But knowing the weather forecast we may know that it is 
‘actually impossible’ that it will rain tomorrow. What we indicate by the 
word ‘actually’ here is the futuric modality. 

Postulate A proposes to use probabilities as a more precise form of 
futuric modalities. With respect to the ordinary use of the word ‘probability’ 
this may be considered as a terminological convention: further on we wish 
to restrict the use of this word to statements about the future. But behind 
this convention lies the view that this is the primary meaning of probability 
and that other uses of the word can be reduced to it. E.g. we apply it to 
the past in saying ‘it is probable that it was raining yesterday’ or ‘the day 
before yesterday it was probable that it would rain the following day’. 
But in the second example probability is referred to what then was future; 
characteristically we here say ‘it was probable’. In the first example we 
first of all admit lack of knowledge concerning the past; to make the 
statement operative we have to apply it to the future in the sense ‘It is probable 
that, if I investigate, I will find out that it was raining yesterday’. 

The restriction of the primary use of probabilities to the future opens 
the choice mentioned in Section 1. But we will discuss this in Section 3, 
confining ourselves to classical probabilities now. 

For the mathematical theory we can take over literally Kolmogorov’s 
text, only changing some notations: 

‘Let M be a set of elements $\xi, \eta, \zeta, \ldots$ which we call elementary events, 
and F a set of subsets of M; the elements of F will be called events. 
I. F is a lattice of sets. 
II. F contains the set M. 
III. To every set A of F we assign a non-negative number p(A). This 
number p(A) is called the probability of the event A. 
IV. p(M) = 1. 
V. If $A_1$ and $A_2$ are disjunct, then 
$$p(A_1 + A_2) = p(A_1) + p(A_2).$$’ 

We leave out axiom VI, which formulates a condition of continuity, since 
we will not discuss its problems here. We need, however, the definition of 
the expectation value: 

‘Let there be a partition of the original set M 

$$M = A_1 + A_2 + \ldots + A_r,$$

and let x be a real function of the elementary event $\xi$ which is equal to a 
constant $a_q$ on every set $A_q$. Then we call x a stochastic quantity and consider 
the sum 

$$E(x) = \sum_q a_q\, p(A_q)$$

the mathematical expectation of the quantity x.’ 
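
The axioms and the definition of the expectation value can be made concrete on a small finite model. The following Python sketch is only an illustration under stated assumptions: the fair die, the uniform weights and all function names (`all_subsets`, `p`, `axioms_hold`, `expectation`) are choices made here and do not come from Kolmogorov’s text. It takes F to be the lattice of all subsets of M and checks Axioms I to V by brute force. 

```python
from fractions import Fraction
from itertools import combinations

# Illustrative assumption: M is the set of faces of a fair die.
M = frozenset(range(1, 7))
WEIGHTS = {xi: Fraction(1, 6) for xi in M}


def all_subsets(elements):
    """Return every subset of `elements`; this plays the role of the lattice F."""
    items = sorted(elements)
    return {frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)}


F = all_subsets(M)


def p(event):
    """Probability of an event: the sum of the weights of its elementary events."""
    return sum((WEIGHTS[xi] for xi in event), Fraction(0))


def axioms_hold():
    """Check Axioms I to V on the finite model (M, F, p)."""
    lattice = all((a | b) in F and (a & b) in F for a in F for b in F)  # I
    contains_m = M in F                                                  # II
    non_negative = all(p(a) >= 0 for a in F)                             # III
    normalised = p(M) == 1                                               # IV
    additive = all(p(a | b) == p(a) + p(b)
                   for a in F for b in F if not (a & b))                 # V
    return lattice and contains_m and non_negative and normalised and additive


def expectation(partition, values):
    """E(x) = sum of a_q * p(A_q) for x equal to a_q on the block A_q."""
    blocks = [frozenset(block) for block in partition]
    assert frozenset().union(*blocks) == M, "blocks must cover M"
    assert sum(len(block) for block in blocks) == len(M), "blocks must be disjoint"
    return sum((a * p(block) for a, block in zip(values, blocks)), Fraction(0))


# The number shown by the die: x is constant on each one-element block.
faces = sorted(M)
mean_face = expectation([{k} for k in faces], faces)
```

Exact rational arithmetic (`Fraction`) is used so that the additivity check in Axiom V is an equality test rather than a floating-point comparison. 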


330 C. F. von Weizsäcker 


We now turn to the physical semantic. In order to simplify the expression 
and to concentrate on the essentials we assume the set M of elementary 
events to be finite. We call the number of elementary events K; in the 
case of the die K = 6. We further consider a finite ensemble of N equal 
cases, e.g. of casts in the case of the die. To every elementary event $E_k$ (we 
write $E_k$ instead of Kolmogorov’s $\xi$; $1 \le k \le K$) we assign a number $n(k)$ 
which indicates how many times this event $E_k$ (say the five on the die) 
has actually happened in the particular series of N experiments which 
forms our given ensemble. Correspondingly we assign an $n(A)$ to every 
event A. It is easy to see that the quantities 

$$\frac{n(A)}{N}$$

fulfil Kolmogorov’s Axioms I to V if we insert them for p(A). This model 
of the axioms is, however, not the one intended by the theory of proba- 
bilities, but we reach our goal by adding a fourth postulate to the pre- 
liminary concept: 
D. The probability of an event (of a proposition) is the expectation value 
of the relative frequency of its happening (its coming true). 

The expectation value used in D is not defined on the original lattice of 
events F. It can be defined on a lattice G of ‘meta-events’. We call a meta- 
event an ensemble of N events belonging to F which happen under equal 
conditions. We here use the language that the ‘same’ event can happen 
several times (‘it has been raining and it will be raining again’). G is not a 
subset of M or of F, but it is a set of elements of F with repetitions. Now 
we can assign a probability function p(A) to F (it may express our expecta- 
tions on the events A according to the preliminary concepts). Then the 
rules of the mathematical theory of probability permit us to calculate a 
probability function for the elements of G; it is only necessary to assume 
that the N events which together form a meta-event can be treated as 
independent. Assuming the validity of Kolmogorov’s axioms for F we 
then can prove their validity for G and the validity of the formula 

$$p(A) = E\!\left(\frac{n(A)}{N}\right). \tag{2.1}$$
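
Formula (2.1) can be checked by direct enumeration for a small case. The sketch below is an illustration only: the three elementary events, their probabilities, the event A and the value N = 4 are hypothetical choices, and the names `meta_events` and `expected_relative_frequency` are introduced here. Each meta-event is a series of N independent cases, so its probability is the product of the single-case probabilities. 

```python
from fractions import Fraction
from itertools import product

# Hypothetical single-case probabilities on F (three elementary events).
p_elem = {1: Fraction(1, 2), 2: Fraction(1, 3), 3: Fraction(1, 6)}
A = {1, 3}   # an event of F, chosen for illustration
N = 4        # number of cases forming one meta-event


def p_event(event):
    """p(A) on F: sum over the elementary events contained in A."""
    return sum(p_elem[k] for k in event)


def meta_events(n_cases):
    """Yield (series, probability) for every series of independent cases."""
    for series in product(p_elem, repeat=n_cases):
        prob = Fraction(1)
        for k in series:
            prob *= p_elem[k]   # independence: probabilities multiply
        yield series, prob


def expected_relative_frequency(event, n_cases):
    """E(n(A)/N), computed with the probability function induced on G."""
    total = Fraction(0)
    for series, prob in meta_events(n_cases):
        n_a = sum(1 for k in series if k in event)
        total += prob * Fraction(n_a, n_cases)
    return total


total_probability = sum(prob for _, prob in meta_events(N))
```

With these numbers `expected_relative_frequency(A, N)` and `p_event(A)` are both 2/3, and the probabilities of all meta-events sum to 1, as the argument requires. 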

We can now forget our preliminary ideas of the meaning of the p(A) in F. 
Instead we can apply the three postulates A, B, C to the lattice G of 
meta-events. After having thus given an interpretation (in the preliminary 
sense) to the p in G, we use (2.1) for deducing an interpretation of the p 
in F. It is exactly what postulate D says: p(A) is the expectation value of 
the relative frequency of A. If we now remember again how we would have 
interpreted the p in F without this construction, we would only have used 
A, B, C. This preliminary concept is now justified as a weaker formulation 
of D. The concepts ‘practically’, ‘approximately’, ‘expectation’ can now be 
more precisely interpreted by error estimates. The mathematical ‘law of 
large numbers’ proves that the expectation values of these errors tend to 
zero for growing N. 
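
The error estimates can be illustrated numerically. The following sketch assumes N independent cases with the same probability p, so that the number n of cases in which the event happens is binomially distributed; the value p = 1/6, the tolerance 1/10 and the function names are assumptions made for the example. It computes the expectation value of the squared deviation of the relative frequency from p, and the probability that the deviation is at least a given tolerance. 

```python
from fractions import Fraction
from math import comb


def binomial_weight(p, n_cases, n):
    """Probability that the event happens exactly n times in n_cases cases."""
    return comb(n_cases, n) * p**n * (1 - p) ** (n_cases - n)


def mean_square_error(p, n_cases):
    """Expectation value of (n/N - p)**2 for the relative frequency n/N."""
    return sum(binomial_weight(p, n_cases, n) * (Fraction(n, n_cases) - p) ** 2
               for n in range(n_cases + 1))


def tail_probability(p, n_cases, eps):
    """Probability that n/N deviates from p by at least eps."""
    return sum(binomial_weight(p, n_cases, n)
               for n in range(n_cases + 1)
               if abs(Fraction(n, n_cases) - p) >= eps)


p = Fraction(1, 6)      # e.g. the five on a fair die
eps = Fraction(1, 10)   # illustrative tolerance
errors = {n_cases: mean_square_error(p, n_cases) for n_cases in (10, 100, 400)}
tails = {n_cases: tail_probability(p, n_cases, eps) for n_cases in (10, 100, 400)}
```

The mean square error equals p(1 - p)/N exactly, so it tends to zero for growing N. The tail probability gives one possible precise reading of ‘practically impossible’: a large deviation of the relative frequency is itself an event of small probability. 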

What have we gained epistemologically? We have not got rid of the 
imprecise preliminary concept, we have only transferred it from events to 
meta-events, i.e. to large ensembles of events. The physical semantic for 
probabilities rests on the preliminary semantic for meta-probabilities. 
This is a more precise expression of our earlier statement, that a proba- 
bility can only be empirically tested with some degree of probability. 
The solution of the paradox lies in its acceptance as a phenomenon: No 
theory of empirical probabilities can be meaningfully expected to yield 
more than this justification which at least makes its consistency more 
evident. 

If we like, we can iterate our process and call this ladder of meta- 
probabilities a ‘regressive definition’ of probability. While the usual 
recursive definition offers a fixed starting point (n = 1) and a rule of 
recursion from n+1 to n, the regression here goes as high as we like. 
At some step of the ladder we have to stop and to rely on the preliminary 
concepts. Due to the ‘law of large numbers’ it will suffice for this highest 
step to postulate A and B. This will yield A, B, and C for the next-lower 
step, and D for the ones below that. 


3 PROBABILITY IN QUANTUM THEORY 


It is possible to build up quantum theory from simple postulates analogous 
to those used in the preceding section. This has been done by Drieschner.¹ 
I must refer to this paper for a more ample justification of the view expressed 
in Section 1 that quantum theory is essentially a new theory of proba- 
bilities. The present section is confined to three tasks: Using ordinary 
quantum-mechanical language I shall very briefly indicate the point of 
departure of the quantum theory of probabilities from the classical theory. 
Then I shall repeat the construction of meta-events and show its connection 
with the method of second quantisation; this was not done by Drieschner. 
I finally correlate this procedure with Feynman’s formulation of quantum 
theory. 

1 Drieschner [1968]. I have given a brief presentation of this work in my [1970]. 

As I said in Section 1 (footnote 1) it depends on the phrasing whether we 
have to abandon Kolmogorov’s first axiom in quantum theory. The phrasing 
refers to the physical semantic. It depends on what we consider to be 
our set of events. If we confine them to the possible results of one experi- 
ment (measurement of one observable or one set of commuting observ- 
ables), we will not have to change the axiom. But if we consider as the set 
of events belonging to one object the possible results of all possible 
experiments that can be performed on this object, then their lattice is given 
by the subspaces of a Hilbert space. This ambiguity is due to the fact that 
in quantum theory there are incompatible experiments. How this fact is 
connected with the other structures of quantum theory is well-known and 
not to be discussed here. Drieschner’s work can be described as an 
axiomatic system of empirical probabilities in which the choice is left open 
as long as possible whether there are incompatible experiments or not. 
This is the choice mentioned in Section 1(b). 

In Von Neumann’s quantum logic the calculus of propositions is adapted 
to the change in the lattice of events; this means a readjustment in the 
rules of negation and disjunction. In order to understand the logical 
meaning of this formal procedure we will have to analyse the meanings of 
the two lattices in terms of the logic of temporal propositions. We have 
interpreted probabilities as quantitative modalities which replace the 
classical truth-values for futuric propositions.¹ Now arguments of mathe- 
matical beauty would induce us to expect that in a quantum logic of 
futuric propositions the fundamental modality will not be the probability 
but the probability amplitude. The question then arises whether we can 
justify the probability interpretation of quantum theory by starting out 
from wave functions (i.e. Hilbert space) and some very simple postulates of 
measurability.² We shall confine our present argument, however, to the less 
ambitious task of showing the consistency of the probability interpretation 
of quantum theory by repeating within its framework the theory of meta- 
events of the preceding section. 

1 It should be noted that this is not a ‘many-valued logic’ in the narrow technical sense 
in which this term is now mostly used. In such a logic the logical functors are defined by 
truth-matrices exactly as in classical logic; only there are more than two truth values. 
That this is impossible for modalities is clear by the simple example: let there be three 
modalities: necessary, impossible, contingent (i.e. neither necessary nor impossible). 
If the conjunctive functor ‘and’ is to be a one-valued truth function, ‘p and q’ must be 
contingent if both p and q are contingent; further, if p is contingent, non-p must also be 
contingent. Now insert non-p for q. The result would be that the contradiction ‘p and 
non-p’ is contingent, hence not impossible, if p is contingent. The logical wisdom behind 
this apparent difficulty is that the definition of functors by truth-functions is a rather 
artificial contraption which only works under the peculiar premisses of classical logic. 

2 This procedure would be complementary to Drieschner’s. He starts out postulating the 
probability concept as meaningful for all measurements and deduces the Hilbert-space 
formalism by additional postulates. 

Consider a quantum-mechanical measurement of one observable. For 
the sake of simplicity we again assume it to admit only of a finite number of 
different possible results. Let this number be precisely K = 2; we then 
have a simple alternative, e.g. the measurement of the spin of an alkali 
atom in a Stern-Gerlach experiment. The two possible results will be 
called k = 1 and k = 2. With respect to this alternative, leaving aside its 
other degrees of freedom, the object in which the alternative can be decided 
has a two-dimensional Hilbert space. We call its state vector $u_k$ (k = 1, 2). 
If $u_k$ is normalised, the probabilities of finding the results 1 or 2 are 

$$p_1 = u_1^* u_1, \qquad p_2 = u_2^* u_2, \tag{3.1}$$
$$p_1 + p_2 = 1. \tag{3.2}$$


Now consider a statistical ensemble of N such objects in which the same 
alternative can be decided. Let 1 be found in $n_1$ cases, and 2 in $n_2$ cases: 

$$n_1 + n_2 = N. \tag{3.3}$$

We will treat the ensemble as a real ensemble, i.e. as a quantum- 
mechanical object composed of N of the simple objects. This is formally 
possible even if the measurements are made at different times, but we 
leave aside this case in which we would have to give a more complicated 
description (including the problem of symmetry); we shall treat the 
measurements as simultaneous. The general state of the ensemble, in 
which the alternative does not need to be decided, can be described by a 
wave function in the configuration space, of $2^N$ dimensions. We simplify 
the calculation by making an assumption on the symmetry of this wave 
function. Let it be symmetrical, that is let us assume Bose statistics for the 
simple objects; Fermi statistics would have confined us to the uninteresting 
case $N \le 2$. The state can then be described by a wave function $\varphi(n_1, n_2)$. 
The set of all $\varphi(n_1, n_2)$ for any $n_1$, $n_2$ describes all possible ensembles with 
finite values of N; a particular ensemble will have a fixed N. 
If the ensemble consists of N objects in the same state $u_k$, we get 

$$\varphi(n_1, n_2) = c_{n_1 n_2}\, u_1^{n_1} u_2^{n_2}. \tag{3.4}$$

We normalise, taking account of (3.3): 

$$\sum_{n_1 + n_2 = N} \varphi^*(n_1, n_2)\, \varphi(n_1, n_2) = \sum_{n_1 + n_2 = N} |c_{n_1 n_2}|^2\, p_1^{n_1} p_2^{n_2} = 1. \tag{3.5}$$

Since 

$$(p_1 + p_2)^N = \sum_{n_1 + n_2 = N} \frac{N!}{n_1!\, n_2!}\, p_1^{n_1} p_2^{n_2} = 1, \tag{3.6}$$

we get 

$$|c_{n_1 n_2}|^2 = \frac{N!}{n_1!\, n_2!}. \tag{3.7}$$


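The numbers $n_1$ and $n_2$ can be interpreted as eigenvalues of the operators 
$n_1$ and $n_2$ which act on $\varphi$ by multiplying it with $n_1$ or $n_2$. The expectation 
value of $n_1$ in $\varphi$ is 

$$\begin{aligned}
\bar{n}_1 &= \sum_{n_1 + n_2 = N} \varphi^*(n_1, n_2)\, n_1\, \varphi(n_1, n_2) \\
&= N p_1^N + N(N-1)\, p_1^{N-1} p_2 + \ldots + N p_1 p_2^{N-1} + 0 \\
&= p_1 \frac{\partial}{\partial p_1} (p_1 + p_2)^N = p_1 N (p_1 + p_2)^{N-1} = p_1 N.
\end{aligned} \tag{3.8}$$

Thus we get 

$$p_1 = \frac{\bar{n}_1}{N} \tag{3.9}$$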


in accordance with postulate D of Section 2. The calculation is easily 
generalised for larger K. 
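
The steps (3.5) to (3.9) can be verified with exact arithmetic for any particular N. The sketch below is an illustration only; the values N = 12 and $p_1$ = 1/3 and the function names are assumptions of the example. It uses the weights $|\varphi(n_1, n_2)|^2 = \frac{N!}{n_1!\, n_2!}\, p_1^{n_1} p_2^{n_2}$ that follow from (3.4) and (3.7). 

```python
from fractions import Fraction
from math import factorial


def weight(n1, n2, p1, p2):
    """|phi(n1, n2)|**2 = N!/(n1! n2!) * p1**n1 * p2**n2, from (3.4) and (3.7)."""
    n_total = n1 + n2
    coefficient = Fraction(factorial(n_total), factorial(n1) * factorial(n2))
    return coefficient * p1**n1 * p2**n2


def norm_and_mean(n_total, p1):
    """Return the normalisation sum (3.5) and the expectation value of n1 (3.8)."""
    p2 = 1 - p1
    norm = sum(weight(n1, n_total - n1, p1, p2) for n1 in range(n_total + 1))
    mean_n1 = sum(n1 * weight(n1, n_total - n1, p1, p2)
                  for n1 in range(n_total + 1))
    return norm, mean_n1


N = 12                  # illustrative ensemble size
p1 = Fraction(1, 3)     # illustrative single-object probability
norm, mean_n1 = norm_and_mean(N, p1)
```

Here `norm` is exactly 1 and `mean_n1 / N` reproduces `p1`, which is the content of (3.9). 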

This little calculation was nothing but the simplest case of second 
quantisation. The index k is a two-valued quantum-mechanical observable. 
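The operators $n_k$ can be built out of operators $u_k^+$, $u_k$ obeying the 
commutation rules 

$$\begin{aligned}
u_k u_l^+ - u_l^+ u_k &= \delta_{kl}, \\
u_k u_l - u_l u_k &= u_k^+ u_l^+ - u_l^+ u_k^+ = 0
\end{aligned} \tag{3.10}$$

according to 

$$n_k = u_k^+ u_k. \tag{3.11}$$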
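
The rules (3.10) and (3.11) can be checked on occupation-number states. The following sketch is a toy model under explicit assumptions: it uses the standard normalised occupation-number basis, in which the lowering operator carries the factor $\sqrt{n_k}$ and the raising operator the factor $\sqrt{n_k + 1}$, a convention not spelled out in the text; a state is stored as a dictionary from occupation tuples $(n_1, n_2)$ to coefficients, and all function names are introduced here. 

```python
from functools import partial
from math import isclose, sqrt


def annihilate(k, state):
    """Apply u_k: |..., n_k, ...> goes to sqrt(n_k) |..., n_k - 1, ...>."""
    out = {}
    for occ, coeff in state.items():
        if occ[k] > 0:
            lowered = occ[:k] + (occ[k] - 1,) + occ[k + 1:]
            out[lowered] = out.get(lowered, 0.0) + sqrt(occ[k]) * coeff
    return out


def create(k, state):
    """Apply u_k^+: |..., n_k, ...> goes to sqrt(n_k + 1) |..., n_k + 1, ...>."""
    out = {}
    for occ, coeff in state.items():
        raised = occ[:k] + (occ[k] + 1,) + occ[k + 1:]
        out[raised] = out.get(raised, 0.0) + sqrt(occ[k] + 1) * coeff
    return out


def commutator(op_a, op_b, state):
    """Return (op_a op_b - op_b op_a) applied to `state`."""
    first, second = op_a(op_b(state)), op_b(op_a(state))
    return {occ: first.get(occ, 0.0) - second.get(occ, 0.0)
            for occ in set(first) | set(second)}


def equals_multiple(state, occ, factor):
    """True if `state` is factor * |occ> up to rounding error."""
    return all(isclose(state.get(key, 0.0), factor if key == occ else 0.0,
                       abs_tol=1e-12)
               for key in set(state) | {occ})


def check_rules(occ):
    """Check (3.10) and (3.11) on the basis state |occ> for K = 2."""
    basis = {occ: 1.0}
    ok = True
    for k in range(2):
        for l in range(2):
            delta = 1.0 if k == l else 0.0
            ok &= equals_multiple(
                commutator(partial(annihilate, k), partial(create, l), basis),
                occ, delta)
            ok &= equals_multiple(
                commutator(partial(annihilate, k), partial(annihilate, l), basis),
                occ, 0.0)
            ok &= equals_multiple(
                commutator(partial(create, k), partial(create, l), basis),
                occ, 0.0)
        n_k = create(k, annihilate(k, basis))   # n_k = u_k^+ u_k
        ok &= equals_multiple(n_k, occ, float(occ[k]))
    return ok
```

Because the states are stored sparsely, no truncation of the occupation numbers is needed, and `check_rules((n1, n2))` can be called for any non-negative integers. The indices k and l run over 0 and 1 in the code, standing for the results 1 and 2 of the text. 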

Second quantisation has generally been considered as some clever 
formal device. It could be proved that it is equivalent to the configuration 
space method, but it was never quite clear what the iteration of the 
quantisation process really meant. I think that this can be understood in 
the present context. 

Second quantisation indeed establishes an ensemble of equal objects, 
each of which, if isolated, would be described by the wave function of the 
first quantisation. On the other hand, the formalism of second quantisation 
is a correct procedure of quantisation. This indicates that quantisation 
might generally be a process of ensemble-building according to the 
peculiar rules of probability that are characteristic of quantum theory. 
And this is exactly the thesis of the present paper: quantum theory is 
nothing but a general theory of probability, i.e. of expectation values of 
relative frequencies in ensembles. To apply quantisation to a wave function 
is conceptually exactly the same step as to apply the concept of expectation 
value in the definition of a probability. 

Feynman¹ has given a formulation of quantum theory in which it 
becomes explicit that it is only a new probability theory. For the sake of 
simple expression we shall divide time and space into discrete steps. We 
then can describe the path of a particle in space in the following way: Let 
the particle be at a position $x_0$ at time $t_0$. There will be a probability 
$p(x_1, x_0)$ of finding it at $x_1$ at the next time, if it was at $x_0$ before. $p(x_2, x_0)$ 
will be found according to classical probability theory as 

$$p(x_2, x_0) = \sum_{x_1} p(x_2, x_1)\, p(x_1, x_0). \tag{3.12}$$

All that quantum theory changes is to replace this by 

$$\psi(x_2, x_0) = \sum_{x_1} \psi(x_2, x_1)\, \psi(x_1, x_0) \tag{3.13}$$

together with the rule 

$$p(x_m, x_n) = |\psi(x_m, x_n)|^2. \tag{3.14}$$

1 Feynman [1948]. 
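
The difference between the composition rules (3.12) and (3.13) shows up already for two intermediate positions. The sketch below is a toy example, not Feynman’s own calculation: the two intermediate positions "a" and "b", the equal amplitudes of modulus $1/\sqrt{2}$ and the relative phase `theta` are all assumptions chosen for illustration, and the function names are introduced here. 

```python
import cmath
from math import sqrt


def compose_classical(p21, p10):
    """(3.12): p(x2, x0) is the sum over x1 of p(x2, x1) * p(x1, x0)."""
    return sum(p21[x1] * p10[x1] for x1 in p10)


def compose_quantum(psi21, psi10):
    """(3.13): psi(x2, x0) is the sum over x1 of psi(x2, x1) * psi(x1, x0)."""
    return sum(psi21[x1] * psi10[x1] for x1 in psi10)


def probability(psi):
    """(3.14): the probability is the squared modulus of the amplitude."""
    return abs(psi) ** 2


def two_path(theta):
    """Return (classical, quantum) probabilities for two intermediate positions."""
    psi10 = {"a": 1 / sqrt(2), "b": 1 / sqrt(2)}
    psi21 = {"a": 1 / sqrt(2), "b": cmath.exp(1j * theta) / sqrt(2)}
    quantum = probability(compose_quantum(psi21, psi10))
    classical = compose_classical(
        {x1: probability(amp) for x1, amp in psi21.items()},
        {x1: probability(amp) for x1, amp in psi10.items()})
    return classical, quantum
```

In this toy example the classical rule gives 1/2 whatever the phase, whereas the quantum rule gives $(1 + \cos\theta)/2$: the value 1 for `theta = 0` and 0 for `theta = pi`. The single-step probabilities are the same in both calculations; only the rule of composition differs. 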


In both theories there is a law for the change of state in time that 
determines the values of the $p(x_m, x_n)$ or $\psi(x_m, x_n)$. In classical physics it is generally 
expressed for the single case, and its most relevant formulation is the 
action principle 

$$\delta S = \delta \int_{t_0}^{t_1} L\, dt = 0. \tag{3.15}$$

From this law the law of change of probabilities can be derived. In quantum 
theory there is no deterministic law for the single case; it is replaced by 
the Schrödinger equation for the probability amplitude: 

$$i\hbar\, \dot{\psi} = H \psi. \tag{3.16}$$

The main result of Feynman’s theory is the connection between these two 
laws. The Schrödinger equation is equivalent to the rule that 

$$\psi(x_1, x_0) = e^{(i/\hbar)\, S(x_1, x_0)}, \tag{3.17}$$

i.e. the classical action is exactly the phase of the probability amplitude for 
the corresponding possible orbit of the particle. 

According to this theory the process called ‘quantisation’ can be read in 
two directions. Following the historical origin of modern quantum theory 
in Bohr’s correspondence principle one would take the classical theory as 
given; Feynman’s law (3.17) then leads to the correct corresponding 
quantum theory. From the point of view of a fully developed quantum 
theory one would take the action S in Feynman’s law as a given function, 
and one would use his theory as a way of finding the limiting case which 
is called classical. But Feynman’s theory, if read in the second direction, 
also explains why there should be a ‘classical’ limiting case and hence a 
process called ‘quantisation’. It is the case in which certain probabilities 
tend towards one and zero and hence nearly precise predictions of the 
corresponding measurements become possible. In this limiting case the 
concept of probability is no longer needed for describing the single event. 
Historically, such a case will in general be understood earlier, and 
‘quantisation’ then is the step towards the more refined theory of proba- 
bilities.¹ 

The actual ‘reverse reading of quantisation’ or ‘understanding of 
classical physics by quantum theory’ is closely connected with the role 
played in classical physics by variational principles. Why should laws 
of nature have the form of variational principles? 

In classical physics all those theories that can describe single events 
can be formulated by variational principles; mechanics, geometrical optics, 
field theories. The exception is thermodynamics, more strictly speaking 
the theories of irreversible processes like, e.g., heat conduction. But 
irreversible processes, according to statistical thermodynamics, never are 
truly single events; they are essentially probabilistic. All these facts are 
explained by Dirac’s² and Feynman’s theories, essentially by applying 
Huygens’s principle. At the classical orbit the phase S has an extremal 
value and hence in its neighbourhood all possible orbits, having nearly 
the same phase, add up. In classical thermodynamics it is exactly the phase 
relations of quantum theory that are neglected; hence this phase-effect 
does not occur. 
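
The adding-up of neighbouring orbits near the extremal phase can be illustrated numerically. The sketch below is a one-parameter toy model, not taken from Dirac or Feynman: orbits are labelled by a single deviation x from the classical orbit, the action is assumed to be $S(x) = x^2/2$ in arbitrary units, and the value of `hbar`, the two windows and the function name are illustrative choices. It sums the amplitudes $e^{(i/\hbar) S(x)}$ over a window of orbits around the extremum and over an equally wide window far from it. 

```python
import cmath


def amplitude_sum(x_min, x_max, hbar, steps=2000):
    """Midpoint sum of exp(i S(x)/hbar) dx for the toy action S(x) = x**2 / 2."""
    dx = (x_max - x_min) / steps
    total = 0j
    for j in range(steps):
        x = x_min + (j + 0.5) * dx
        total += cmath.exp(1j * (0.5 * x * x) / hbar) * dx
    return total


HBAR = 0.01  # small compared with the scale on which S varies (assumption)

# Window around the extremum of S (the 'classical orbit' at x = 0) ...
near = abs(amplitude_sum(-0.5, 0.5, HBAR))
# ... and a window of the same width far away from it.
far = abs(amplitude_sum(2.0, 3.0, HBAR))
```

With these values the contribution of the window around the extremum is more than ten times larger than that of the distant window, where the rapidly varying phases largely cancel. 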

All this is well known. It explains the validity of classical variational 
principles by the quantum theory behind it. But nobody seems ever to 
have asked why quantum theory itself (i.e. the Schrödinger equation) also 
can be derived from a variational principle. I think the present interpreta- 
tion of second quantisation answers this question. There is a quantum 
theory behind quantum theory, precisely because probabilities can only be 
defined with the help of probabilities. This means no more than that for 
every possible object of quantum theory there also exists a possible 
quantum-theoretical object consisting of many equal objects of the former 
level. Hence one is led not only to second but to multiple quantisation.³ 
The mathematical analysis of this structure has to be left to another 
paper. 


Max-Planck-Institut 

zur Erforschung der Lebensbedingungen 
der wissenschaftlich-technischen Welt, 
Starnberg 


1 There is a connection between this consideration and the concept of measurement as 
an irreversible process which I do not discuss here. 
2 Dirac [1933]. 
3 Weizsäcker, Scheibe and Sussmann [1958]. 


Brain Bisection and Personal Identity 
by ROLAND PUCCETTI 


Introduction 
1 Lateralisation of Consciousness 
2 Voices of Dissent 
3 Hemispheric Conflicts 
4 Cerebral Dominance 
5 Hemispheric Awareness 
6 Two Minds, One Person? 
7 Who Am I? 


INTRODUCTION 


It is customary to think of a human being as having a single brain, posses- 
sing a unitary mind, constituting a unique individual person. However, 
recent studies of patients whose cerebral commissures¹ have been sectioned 
to prevent interhemispheric spread of epileptic seizures suggest a very 
different state of affairs. 

The operation is relatively simple in conception, if not in execution. 
Both a frontal and a posterior opening are made in the top of the skull, 
followed by mid-line sectioning of the corpus callosum,² the anterior and 
hippocampal commissures,³ and in some cases the massa intermedia as 
well. Upon recovery these patients appear to function just about as well in 
ordinary situations as before the operation, except for some loss of short 
term memory. However, under testing conditions, where information is 
projected into each disconnected hemisphere independently, the picture 
changes. Sperry [1968a], Sperry, Gazzaniga and Bogen [1969], and 
Gazzaniga [1970] have provided exhaustive descriptions of the deconnec- 
tion syndrome. I shall review here only the principal results. 

1 ‘cerebral commissures’ = all the main nerve connections between the two hemispheres 
of the brain. 
2 ‘corpus callosum’ = the principal commissure, containing about two hundred million 
nerve fibres. 
3 ‘anterior and hippocampal commissures’ = lesser nerve connections. 


1 LATERALISATION OF CONSCIOUSNESS 


Given an intact optic chiasm¹ and decussation² of the visual pathways, 
what is flashed into the left visual half field³ of the commissurotomy 
patient too rapidly for eye movement is seen by the right hemisphere only 
(and vice-versa). If, for example, a picture of a key, or the word ‘key’, 
appears in the left half field and you ask the brain-bisected subject what 
he saw he answers that he saw nothing, or at most a flash of light. This 
is because in right-handed people the speech centres are in the left half 
brain, which has not had the information relayed to it from the right 
hemisphere, since commissural connections are now gone. The right half 
brain, however, has seen and understood what was the object or word; it 
just cannot say what it was. Thus if you ask the patient not to name the 
object or word but to nod his head or otherwise indicate it non-verbally, 
say by pointing it out among an array of objects produced by the examiner, 
he will successfully identify it. Asked how he was able to do that, he 
replies that he does not know, or confabulates. That is of course the left 
hemisphere talking. Similarly, if two distinct figures are presented simul- 
taneously, say a dollar sign in the left half visual field (feeding into the right 
hemisphere) and a question mark in the right half visual field (going to the 
left hemisphere), and he is asked to draw out of sight with his left hand 
(controlled by the right hemisphere) what he saw, the patient will un- 
hesitatingly draw the dollar sign but may report that he has drawn a 
question mark. 

1 ‘optic chiasm’ = bundle of fibres leading from each retina of the eye back to the visual 
cortex. 
2 ‘decussation’ = crossing over from one side of the body to the other side of the brain. 
3 ‘left visual half field’ = what appears to the left of ‘fixation’, or the visual centre; this 
strikes the right half of the retina in both eyes and is relayed to the right visual cortex in 
the right hemisphere. Vice-versa, of course, for the right visual half field impinging on 
the left half of the retina of each eye. 

Parallel results are obtained with the other sensory modalities. Objects 
placed in the left hand out of view cannot be named but can be identified 
non-verbally. If two objects are placed, one in each hand, out of sight, 
then removed and scrambled with other objects behind a screen, each 
hand will retrieve its own object while passing over the other’s. In audition 
sounds are heard with both ears, but a conflict in verbal instructions has the 
result that each hemisphere attends to the opposite ear. Thus Gordon 
[1972] told a split-brain subject through the right ear to scratch the table 
while telling him through the left ear to point to the ceiling. He did both, 
that is he scratched the table with his right hand and waved the left in 
the air. Asked why he was doing this, he denied hearing the left ear 
command. In olfaction the neural pathways are same-sided; thus if the 
left nostril is occluded, an odour introduced into the right nostril gets into 
the right, speechless hemisphere of the commissurotomised subject only. 
Here the patient will express unpleasantness by saying something like 
‘Phew!’, but he cannot tell the examiner whether it was garlic, cheese, 
rotten egg, etc. (Gordon and Sperry [1968]). 

As a result of this lateralisation of consciousness, each hemisphere is 
able to undertake different, even conflicting discrimination tasks at the 
same time. Gazzaniga and Sperry [1966] have shown that the brain- 
bisected subject can discriminate for colour in one hemisphere and for 
brightness in the other as rapidly as he can learn either task alone. What 
we have here, then, is an apparent doubling of conscious awareness, with 
independent cognitive and volitional processes at work in the same body, 
and distinct memory traces building up which are no longer accessible to 
recall by the other hemisphere. It seems that when a human brain is split 
anatomically, the consequence is less a halving than a doubling of its 
functions. 


2 VOICES OF DISSENT 


Not everyone familiar with these reports of experimental findings, however, 
accepts this interpretation. Eccles has said [1967] that goings-on in the 
minor cerebral hemisphere¹ are not experienced at all! In a more recent 
discussion (Eccles [1970]), he compares the right hemisphere to a com- 
puter which selects objects presented to the left half visual field with the 
left hand, but quite unconsciously. In normal people, asks Sir John, 
What is the functional importance of the minor hemisphere other than to receive 
from the sense organs, to do complex computations thereon, then to transmit to 
the dominant hemisphere, and finally to receive from this hemisphere and 
transmit to the muscles of the opposite side? 


In the absence of verbal reports from the minor hemisphere, he says, we 
must remain agnostic about its being conscious just as we must in the case 


of dumb animals. 
But if speech is a necessary condition of consciousness, then the aphasic? 


1 ‘minor cerebral hemisphere’ = right hemisphere; mute or nondominant (especially for 
spedch in right-handed humans). : 

* ‘aphasic’ = person rendered speechless by damage to the speech centres (usually in- 
the temporal lobe of the left hemisphere). . 
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who can play the piano—as Ravel did—is playing unconsciously, and 
_ the person who has a left hemispherectomy' is a walking automaton. 
In commissurotomy tests where the right hemisphere has had, say, a key 
presented to it visually and the patient is asked what he has seen, the 
speech hemisphere often guesses while the subject’s head nods negatively 
at each wrong guess, or frowns disapprovingly. What is this if not a 
conscious, wilful reaction on the part of the right hemisphere? If Eccles’ 
argument about one half brain being unconscious is taken seriously, why 
could it not be reversed? Why not say the major hemisphere? is unconscious 
and computer-like, since in fact it is far better at calculation and linear-like 
analytic functions than the right half brain? 

It is perhaps important to stress here just what the right hemisphere of a 
typically right-handed commissurotomy patient can do. If he is asked to 
retrieve an object with his left hand that ‘goes with’ a visual stimulus 
projected to the minor hemisphere, but may not match it, he selects a toy 
wrist watch after seeing a wall clock, a coin after seeing a dollar sign, a 
nail or spike after seeing a hammer. Clearly the mute hemisphere has the 
concept of, respectively, timepieces, currency, tools (Sperry [1968a)). 
Even moderately advanced definitions of what to feel for, like ‘measuring 
instrument’, ‘eating utensil’, ‘container for liquids’ or ‘used for slicing’ 
will allow the right hemisphere to retrieve a related item, e.g. a ruler, a 
spoon, a glass and a knife (Sperry and Gazzaniga [1967], Sperry [1968d]). 
The minor hemisphere, then, does have language comprehension and at 
least a rudimentary verbal conceptual scheme; it is simply unable to 
utilise these for speech production or writing. The suggestion that it can 
do all these things unconsciously is not logically defeasible, but neither is 
the suggestion that people can talk unconsciously (as they sometimes do). 
‘Other-hemisphere’ scepticism seems to be in the same epistemological 
boat as scepticism about other minds. 

MacKay ([1966a] and [1966d]) has a very different sort of criticism to 
offer. He says there is only one person who is conscious, but that in 
commissurotomy this person has a split control system which enables 
him to pay attention to one half of his bodily environment at a time; and 
that both hemispheres are nevertheless subject to a single ‘metaorganising 
system’ which assigns values or response probabilities to the organism as 
a whole. 

As for the first point, it does not seem able to accommodate the fact that 
commissurotomy subjects are able to attend, not merely to stimuli presented 


1 ‘left hemispherectomy’ = removal of the entire, or most, of the left cerebral hemisphere 
by surgery. 
2 ‘major hemisphere’ = left cerebral hemisphere in right-handed persons, speaking or 
dominant hemisphere. 
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consecutively to each side of the body, but to stimuli of quite different and 

even conflicting sorts at the same time. Why then speak of a single, split 
control system? Why not speak of two independent control systems? 
As for the second point, Johnson and Gazzaniga have reported experiments 
in which split-brain monkeys were not able to transfer reward values from 
one hemisphere to the other, even for something so vital as water [1971]; 
and experiments in which human splits were unable to connect the stimulus 
with the reward, when these are projected to different hemispheres, in 
over thirty trials [1969]. Where, then, is the single metaorganising system 
postulated by MacKay? 


3 HEMISPHERIC CONFLICTS 


It might well be wondered why, if both hemispheres of the split-brain 
patient are conscious and capable of independent volition, there are not 
frequent conflicts between them. Sperry [1968a] provides a partial 
explanation as follows. 


The fact that these two separated mental spheres have only one body, so they 
always get dragged to the same places, meet the same people, and see and do the 
same things all the time and thus are bound to have a great overlap of common, 
almost identical experience [is one unifying factor]. The unity of the optic 
image—and even after chiasm section in animal experiments, the conjugate 
movement of the eyes—means that both hemispheres automatically center on, 
focus on, and hence probably attend to, the same items in the visual field all 
the time. 

Other unifying factors, of course, would be access to the same sounds 
and smells from the immediate atmospheric environment; rapid eye 
movements scanning objects to right and left, hence getting pretty much 
the same visual information into each hemisphere; common subroutines 
of learned behaviour stored in the still intact cerebellum¹; and a common 
internal milieu (blood sugar, hormone levels, etc.). Then too, sectioning is 
not complete in human subjects: some autonomic, humoral and muscular 
reactions of the body are shared, either by peripheral sensory feedback or 
via the intact brainstem, the shared vascular system, cerebrospinal fluid, 
and so on. Since primary drives are mediated at subcortical levels, it is 
no surprise both hemispheres feel hungry, thirsty, lustful, or what have 
you at the same time. Even in test conditions an emotional reaction gets into 
both hemispheres. I have already mentioned the case of a patient reacting 
with disgust to an unpleasant smell he cannot specify verbally; another is 


1 ‘cerebellum’ = ‘small brain’ nestling under the roof of and at the back of the main 
brain; it enables us to perform complicated learned activities like walking or brushing 
our teeth or playing tennis largely unconsciously. 
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recounted by Gazzaniga [1970] where a nude pin-up flashed to the right 
hemisphere led the female patient to chuckle and say the machine was 
funny, though of course she couldn’t say what was funny about it. 
Nevertheless conflicts do occur, especially in the early months of post- 
operative recovery. Zaidel [1972] reports a patient shortly after the opera- 
tion pushing a plate away with one hand and retrieving it with the other, 
only to have it pushed away again. Sperry [1966] found this most pro- 
nounced in the first subject studied, again in the early postoperative period. 
He was reported by his wife to sometimes pull up his trousers with one 
hand and push them down with the other while dressing; or the left hand, 
having helped the right to tie the belt of his robe, would promptly untie it 
again. On one occasion the left tried to push away his wife while the right 
hand was beckoning her; once it seemed to threaten her and had to be 
restrained by the right hand. In Sperry’s phrase, there seems to be in 
such incidents a conflict between ‘willpower-right’ and ‘willpower-left’. 
However Sperry [1972a] and Gazzaniga [1972a] caution that this patient 
had extensive right hemisphere brain damage before the operation. 
Bogen [1972], on the other hand, points out that there is always brain 
damage—associated with epilepsy—in these patients, and says Case I 
may be more typical than others because the injury occurred at a mature 
age (30), when he was well lateralised for all functions.¹ In any event the 
number of cases studied so far is too small to be sure how typical hemis- 
pheric conflicts are of the commissurotomised human subject. 


4 CEREBRAL DOMINANCE 


Still another consideration that enters in here, though it is not a unifying 
factor so much as an overruling one, is that in the split-brain patient each 
hemisphere is usually able to exert its control over the whole motor system. 
Thus for example a triangle projected into the right half brain can be 
drawn by the patient’s right hand provided the left, dominant hemisphere 
does not see another figure, say a square, at the same time. If it does, the 
right hand is always obliged to draw the square. (However if the pencil is 
in the left hand, where the left hemisphere’s ipsilateral motor control is 
weaker than the right’s contralateral one, the right hemisphere has its 
way and draws a triangle.) 

The whole question of cerebral dominance is a knotty one into which we 
are just beginning to get insights. Other animals do not show it. While 
rats, for example, favour one or the other forepaw, it is just as likely to be 

1 ‘well lateralised for all functions’ = speech production and higher language 
comprehension, as well as advanced analytical abilities, confined to the left hemisphere; 
Gestalt perception confined to the right hemisphere. 
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the left as the right one. And there is no evidence that other twin-brained 
mammals, even sub-human primates, have differentiated cerebral abilities: 
whatever one hemisphere can do the other does just as well. But humans 
are preponderantly right-handed, or left-brained. As Sperry put it [1972], 
the speech hemisphere remains the more aggressive, executive or leading 
hemisphere even after surgical separation from the mute hemisphere. 
Why at age two or three originally bilateral verbal processes begin to 
retreat from the right brain and concentrate in the left one alongside the 
usually dominant centres of motor control is not clear. But perhaps there is 
selective advantage in this. Perhaps it opens the way for a quite different 
sort of information-processing in the speechless hemisphere. 

What might that be? After commissurotomy right-handed patients 
develop what Bogen [1969] calls ‘dyscopia’ of the right hand and ‘dys- 
graphia’ of the left. That is to say that while they were preoperatively able 
to write their names and copy figures almost as well with either hand, 
after the operation they can hardly write at all with the left hand or copy 
figures with the right, though the opposite hand does as well as always. 
Obviously graphic ability was and remains characteristic of the speech 
hemisphere, but configurational tasks are right brain directed. On the 
hypothesis that the analysing function of language is antagonistic to 
Gestalt perception, Levy [1969] suggested there may be selective advantage 
in lateralising the integrating Gestalt function into the mute hemisphere 
where it could develop more fully. To test her idea she administered the 
Wechsler Adult Intelligence Scale to right-handed and left-handed 
graduate students at California Institute of Technology. Since left-handers 
have some bilateralisation of speech¹, she expected them to do less well on 
the performance part of the WAIS, which involves Gestalt appreciation. 
Interestingly, there was a difference of only eight IQ points between verbal 
and performance scores for dextrals, while the discrepancy for sinistrals 
was twenty-five points: a range so large that it would occur by chance less 
than twice in ten thousand trials. 

This combination of language processing and motor dominance in the 
speech hemisphere has converted the minor hemisphere into something 
of a cerebral helot. Gazzaniga [1973], following earlier animal studies made 
with Sperry, notes that a commissurally intact patient with lesion² of the 
left hemisphere shows much less right hemisphere cognitive or decision- 
making ability than does the right hemisphere of a commissurotomised 
patient. Presumably intact interhemispheric connections allow the lesioned 
left hemisphere to continue inhibiting upper level functions; it is only by 

1 ‘bilateralisation of speech’ = speech centres in both hemispheres. 
2 ‘lesion’ = organic damage to brain cells. 
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disconnection that the right hemisphere asserts itself and shows what it 
can do. 

But as anyone who has servants knows, dependence works both ways. 
You can fire your maid, but then you have to do the housecleaning 
yourself. Saul and Sperry [1968] report a case of total agenesis¹ of the 
corpus callosum—the main commissural connection—where there never 
was mutual dependence between the hemispheres. This patient was 
above average in verbal IQ but definitely subnormal in perceptuomotor 
tasks like block design, drawing in perspective, or anything involving 
Gestalt perception. A possible inference here is that verbal faculties 
developing in both hemispheres prevented full development of configura- 
tional skills in either. That the speaking hemisphere relies on the mute 
companion half brain for visuospatial information-processing is further 
evidenced by the fact that a simple pattern task like judging which two 
zig-zag figures are oriented in the same direction elicits a verbal response 
from the normal, cerebrally intact subject much more quickly when first 
presented to the right hemisphere. If first presented to the left hemisphere, 
it takes 14 msec longer to get a response: indicating that the pattern was 
sent over to the minor hemisphere for decoding prior to the dominant 
half brain giving ‘its’ answer (Gibson, Filbey and Gazzaniga [1970]). 

The nondominant hemisphere reveals its role in other ways. Levy, 
Trevarthen and Sperry [1972] describe an experiment in which half-faces 
are shown to each hemisphere of a split-brain subject. The right hemisphere, 
for example, sees a young woman, the left a young child; asked to point 
to the correct picture among several with the right hand, the patient points 
to the picture of the young woman, even though that hand is normally 
controlled by the hemisphere that saw a child. Further evidence is found 
in the fact that lesions of the right parietal lobe often cause facial agnosia: 
such people fail to recognise relatives they see every day; sometimes they 
cannot recognise themselves in a mirror (Hécaen and Angelergues [1962]). 
Clearly the speaking hemisphere depends heavily on its silent, normally 
submissive partner for major cognitive functions. 

And finally, though the connection is by no means obvious, it appears 
we sing more with the right hemisphere than the left. Using Amytal 
injections in the right carotid artery,² which depresses the right hemisphere, 
Bogen and Gordon [1972] found grossly disturbed singing in six patients, 
though speech was hardly affected. However left carotid injections, 
depressing the left hemisphere, were not conclusive in three patients: one 
1 ‘agenesis’ = failure to grow naturally. 
2 ‘carotid artery’ = the principal supplier of blood to the brain coming from the heart to 
each side of the neck and feeding into the cerebral hemisphere on that side. 
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remaining silent, the other singing in a melodic but slurred way despite 
temporary paralysis of the speech centres, and a third with temporarily 
depressed left hemisphere singing familiar songs, though unable to talk. 


5 HEMISPHERIC AWARENESS 


One problem that immediately confronts anyone who accepts the more 
obvious implications of commissurotomy studies is the following: if indeed 
there are two mental spheres, must they not be aware of each other’s 
existence? This would be particularly expected in test conditions where, 
for example, the subject is asked what he felt or saw or did and cannot 
answer, or just guesses, while one part of his body knows the answer and 
can give it non-verbally for him; or even expresses annoyance at his wrong 
responses. Yet all the indications seem to be that the speech hemisphere, 
that one capable after all of introspective verbal reports, simply ignores 
the existence of his silent servant; and the nondominant partner never 
acknowledges, so far as we know, his talkative cerebral companion. 

This is strikingly exemplified in the observation by Sperry [1968] that 
after commissurotomy the patient never complains about loss of half the 
visual field. Whereas before the operation he could, say, name all the 
objects to right and left of the examiner while gazing into his eyes, post- 
operatively he can name only those to the right. Yet he seems not to 
notice this loss; indeed he does not respond to an appropriate question 
about the difference from the examiner. As Gazzaniga says [1970]: ‘It is as 
if the mechanism for the realisation that vision was once available across 
the midline exists only when the callosum is intact.’ But if long-term 
memory is not impaired by commissurotomy, how could such an enormous 
informational deficit go unnoticed by the speaking hemisphere? It would 
be like the myopic person not remarking that everything has gone blurry 
when he takes off his spectacles. 

Gazzaniga [1972b] suggests the following parallel. When someone has a 
lesion in the peripheral visual system, e.g. in the retina or optic nerve, 
he notices a hole in his visual field because he retains memory of that area 
being filled with visual stimuli that no longer come in. However when the 
scotoma¹ is due to a lesion in the visual cortex² itself, there is no sense of 
loss because all memories associated with it are gone too, and the remaining 
cortex has no way to match the absence of new stimuli with a stored record 
of previous visual data for that area. Apply this now to the split-brain 
1 ‘scotoma’ = blank spot in the visual field. 
2 ‘visual cortex’ = that concentration of neurons on the surface of the brain at the rear of 
each hemisphere where, when these discharge, one has visual experiences of the 
opposite half of the visual field. 
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subject. If memory of the whole left visual field is stored in the right 
half brain, how would the left hemisphere notice its absence? A myopic 
person who lost memory of seeing objects clearly each time he removed 
his glasses would not comment on perceptual blurriness either (Gaz- 
zaniga [1972a]), just as the infant myopic does not. 

But if so, why speak at all of losing half the visual field? Perhaps it is a 
mistake to talk of cerebral hemispheres, or half brains—though it is 
anatomically accurate to do so. At least with regard to something so basic 
as perceptual unity, we might do better to speak of the left brain and the 
right brain. On that interpretation neither brain has lost anything in terms 
of its integral function; it has lost only free information about the other 
brain’s visual field. The left brain, as always, is able to describe objects in 
its visual field. The right brain, as before, is unable to name objects in its 
visual field. It is only because we persist in thinking of these two brains as 
sharing a single visual field that we misleadingly call their visual fields 
half fields. 

Just as striking evidence for hemispheric self-centredness is available in 
the introspective verbal reports of the left brain after commissurotomy, 
already noted. Why does not the speaking brain¹, except very rarely and 
in a limited, quite guarded way, recognise another mind at work in problem 
situations where the left hand successfully retrieves an object the speech 
brain can only guess at? Yet it usually does not. It simply evades the issue, 
says it doesn’t know how it did that, or at most disavows having done it but 
offers no explanation. In part, of course, this is the result of having assimi- 
lated a verbal conceptual scheme in which there is only one mind or person 
per body. Sherrington [1947] expressed it grandiloquently as follows: 
This self is a unity... it regards itself as one, others treat it as one. It is 
addressed as one, by a name to which it answers. The Law and the State schedule 
it as one. It and they identify it with a body which is considered by it and them 
to belong to it integrally. In short, unchallenged and unargued conviction assumes 
it to be one. The logic of grammar endorses this by a pronoun in the singular. 
All its diversity is merged in oneness. 

Yet one wonders. The right brain, which has some language compre- 
hension but probably far too little for assimilation of such an abstract 
notion as this, is unlikely to be verbally conditioned to the same extent as 
the left brain. With its highly imagistic, non-verbal thought processes, 
might it not see what the left brain cannot? Sperry [1968b] reports that 
when the right brain is spelling out a word like ‘coat’ with block letters in 
the left hand behind a screen, the major hemisphere intrudes verbally, 
saying ‘This is an M’ when the minor hemisphere knows it is an ‘A’. 

1 ‘speaking brain’ = the left hemisphere in right-handed (and some left-handed) people. 


Brain Bisection and Personal Identity 349 


In that case it just ignores the speech brain’s chatter and gets on with its 
spelling. But must it not be aware of another mind competing for the 
answer, particularly since it expresses annoyance at the wrong answers it 
hears? Of course we do not know—it makes no introspective reports. 
But in silence there could be greater understanding. 

Consider the experiment by Levy, Trevarthen and Sperry [1972] already 
mentioned, where different half-faces are simultaneously projected to the 
separated hemispheres. In order to test the existence of a subordinate 
percept in the hemisphere which had not made an identification, routine 
responses were deliberately blocked. For example, before the subject 
could point to the face he had seen (right hemisphere), the visual stimulus 
array was removed and he was asked first to say what he had seen (left 
hemisphere). Or conversely, before he could say what he had seen he was 
told first to point to the correct face. As expected, the verbal response and 
the manual response conflicted. I quote: 


In these double responses, the conflict between the verbal and the manual 
responses became evident to the subjects and resulted in a considerable per- 
plexity and confusion. This was lessened by both the brevity and the incomplete- 
ness of the exposure which tended to leave a somewhat uncertain impression of 
each stimulus half. When the verbal response came first under direction from the 
left hemisphere the right hemisphere had to choose between the announced 
verbal selection and its own strong impression favouring another of the faces. 
In only one trial in four did this result in loss of the response preferred by the 
right hemisphere. When the manual response came first the verbal hemisphere 
had to decide whether to be consistent and describe the face that had already 
been pointed to or to ignore this in favour of its own recall of a different face. 
Again, omission of all reference to the second (right half) face occurred only 
once in four trials (pp. 67-8). 


My interpretation of this is as follows. In the intact cerebrum the right 
hemisphere does just about all facial recognition, which is then relayed to 
the left brain for naming. In the commissurotomy patient under test 
conditions, the minor hemisphere is still dominant for recognising faces, 
so where a manual response is indicated it points with either hand to the 
face it saw and the speech hemisphere goes along, being much less sure 
what it saw. In a verbal response situation, however, the left hemisphere is 
dominant and the right brain, being mute, hears a face named it did not see. 
Now apply this to double responses. Expecting to name the face it saw, 
the left hemisphere sees one or the other hand pointing to a face it did not 
see and was not going to name. Its trained response has been thwarted, so 
more often than not it rejects the manual identification in favour of its 
own. No doubt this is perplexing to the speech hemisphere. It seems to it 
as if its body was inexplicably out of step with its intention. 
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But precisely because it is a verbal hemisphere, and the only voice it 
ever hears issuing from its body is its own, it cannot very well break with 
verbal conditioning and acknowledge a rival centre of conscious control in 
the same body. There is for it, as Sherrington said, but one mind or person 
per body. 

Now what of the mute hemisphere? It is accustomed to identifying 
faces and hearing them named, but before the operation had no reason to 
think it was not doing the naming as well. Postoperatively, and normally 
only in these special test conditions, it hears a voice issuing from its own 
body which it now has reason to believe comes from another, distinct centre 
of conscious control. The face named was not the face it saw, not the face 
it was going to point to before the examiner blocked that response. But all 
it can do is stick to its guns and, more often than not, continue pointing to 
a face different from the one named. Still it is in a better position, I think, 
to realise the true situation. It is so because even if without an abstract 
notion of selves or minds, the speechless hemisphere has a life-long associa- 
tion of language heard and understood with conscious control of a body, 
and it knows the words it heard came from its body but not from itself. 
Interhemispheric agnosia may after all go only from left to right in the 
disconnected human brain. 


6 TWO MINDS, ONE PERSON? 


Scientists writing on the interpretation of cerebral commissurotomy often 
talk as if we must reconcile the notion of two minds with the concept of a 
single person whose minds they are. Thus Geschwind [1965] says he and 
Edith Kaplan had to learn to refer separately ‘to that part of the patient 
which could speak normally’ and to ‘that part of the patient which “knew” 
(non-verbally) what was in the left hand’. Gazzaniga [1970] puts it this way: 
that ‘mind-left’ did not know what was in the left hand, but ‘mind-right’ 
did. Sperry [1964] uses this formula: that the surgery has left these 
commissurotomised people with ‘two separate minds’.* And Bogen [1969b] says 
flatly that splitting the brain shows each of us ‘has two minds in one person’. 
But how could two minds co-exist in one person? Here, of course, we 
are not talking about a schizoid personality with divergent impulses; or 
of dissociative reactions in the same person: the commissurotomy subject 
exhibits neither of these syndromes. Rather we are concerned with the 
* In his ([1968c], p. 314), Sperry says subjects who deny seeing anything in the left 
visual field ‘are telling us only a half truth’. More recently he says also: ‘It is most 
compelling to see the same individual performing the same test task in very different 
ways, and with different strategies, depending on whether he is using his left or his 
right hemisphere’. (See ‘Lateral Specialisation in the Surgically Separated 
Hemispheres’, by R. W. Sperry, in The Neurosciences: Third Study Program, p. 7, in press.) 
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question whether it makes sense to say two distinct cognitive experiences 
eliciting different responses from a single body are had by one and the 
same person. Is it possible for a person P at time T₁ to know he has, say, a 
key in his left hand—and not a coin or a marble—yet not know what is in 
that hand? If so, we should be able to say intelligibly of P that at time T₁ 
he both knew and did not know he held a key in his left hand. But that 
certainly sounds contradictory, and if it is we could not say this intelligibly 
at all.* More glaring still, consider the case just discussed where incom- 
patible half-faces are seen by the deconnected cerebral hemispheres at the 
same time. Asked which face he saw among an array of several, P says 
he saw a child’s face while he points at the picture of a young woman. 
Would this not be a double contradiction if P is in fact a single person, 
since ‘mind-left’ denies seeing a woman while ‘mind-right’ (manually) 
denies seeing a child’s face? Either commissurotomy results in two 
minds or it does not; if it does, it also yields two persons. 

But even this last statement is misleading. How can commissurotomy 
create two minds or persons if there was just one before? Which mind, the 
left brain-based one or the right brain-based one, is brand new? And how 
are we to make a choice here? Both brains, as we have seen, were conscious 
and functioning in their rather specialised ways before the operation. 
It is just that they then functioned more synchronously—because of the 
commissural connections—and no longer do so in test situations (or not 
nearly as much) because they have been surgically separated. Thus even in 
the normal, cerebrally intact human being there must be two persons, 
though before the era of commissurotomy experiments we had no way of 
knowing this.† 

Consider the experiment cited earlier from Gibson, Filbey and Gaz- 
zaniga [1970]. A figure presented to the left hemisphere of a normal is 
* It has been suggested to me by two colleagues, Alexander Rosenberg and Richmond 
Campbell, that there would be no contradiction if we say both hemispheres are the 
hemispheres of a single person, P, and hence it is simply not true to say P does not 
know what he is holding in his left hand: he knows it with his right hemisphere. Now it 
is the case that there is a person who knows what is in the left hand. If it is the same 
person whose left hemisphere has speech centres, why does he not say it is a key, 
instead of guessing that it is a coin or a match? And if this person’s right hemisphere, 
hearing the mistaken guess, frowns and shakes his head, must we not say P disagrees 
with himself about what is in the left hand? 

† However A. L. Wigan suspected and believed we have two minds as early as 1844, 
when he published a book called The Duality of the Mind. Wigan’s first thoughts in 
this direction were prompted by observing an autopsy on a man he knew well who 
conversed rationally and had even written verses shortly before his death, but who 
proved to have only one cerebral hemisphere. (Vide: Bogen [1969b].) Several other 
nineteenth-century writers came to similar conclusions. The reason I say we did not 
know this was the case until recently is that only in cerebral commissurotomy does one 
get clear evidence of distinct mental spheres functioning simultaneously; and of course 
I am arguing that two minds is logically equivalent to two persons. 
named verbally 14 msec more slowly than if first presented to the right 
hemisphere, suggesting that visual information of this sort is sent over to 
the minor hemisphere for processing. In that case at a given moment T₁ 
the left brain has a visual stimulus from the right visual field 7 msec 
before the right brain has it (or a bit less, depending on how long the 
processing takes). This is an insignificant time difference, perhaps, but 
it allows us to say there will always be some difference in the information 
content of each brain. Similar considerations apply to hearing a sound that 
strikes one eardrum slightly before the other, because the atmospheric 
disturbance emanates from a source closer to it; or to an odour reaching 
one nostril before the other and a tactile stimulus affecting one side of the 
body. There is never complete parity of mental content at any given 
instant. 

Consider also, in the monumental study by Bogen [1969b], evidence 
cited of 185 cases involving gross ablations or even complete hemi- 
spherectomy where there remained a ‘person’, no matter which hemisphere 
was gone. Now if the person were unitary despite the duality of mind 
Bogen postulates, it ought to be the case that only half the original person 
has survived. Yet over and over again clinical reports suggest that essentially 
the same personality, character traits and long term memory traces 
persist postoperatively. The only way I can see to explain this is to say the 
same ‘person’ did not survive hemispherectomy at all. Because this former 
‘person’ was never a unitary person to begin with. He or she was a com- 
pound of two persons* who functioned in concert by transcommissural 
exchange. What has survived is one of two very similar persons with 
roughly parallel memory traces, nearly synchronous emotional states, 
perceptual experiences, and so on, but differential processing functions. 

Bogen [1969] recounts clinical examinations of two patients recovering 
from opposite-sided cerebral hemispherectomies. I quote at some length 
from his report, but will condense the more technical parts. 


G.E. was 28 years old when she first became aware of incoordination in the left 
hand. This progressed over several months to complete paralysis. [She eventually 
had a complete right hemispherectomy, sparing only parts of the basal ganglia.¹] 
When seen on July 19, 1967, G.E. was an attractive woman with no impairment 
of voluntary facial movement but some impairment of the left side when smiling 
or frowning. She spoke excellently and was very quick to understand the test 
procedures. She walked reasonably well, including up and down stairs, with a 
short leg brace on the left leg and a cane in her right hand. Her left arm was in 


* I have developed the notion of our being compound persons also in the more exotic 
and fanciful context of certain philosophical puzzle cases involving personal identity. 
Vide my [1973]. 

1 ‘basal ganglia’ = subcortical structures associated with slow limb movements. 
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spastic flexion but the hand could retain an object placed in it. She wrote 
fluently with the right hand but could copy only with great effort. She was 
fully aware of both her high verbal intelligence and of her defect in copying, 
and at the conclusion of the interview said, ‘Just imagine how smart I would be 
if I had both halves of my brain!’ 


E.C. was 47 years old when [his left hemisphere was removed] . . . He was seated 
in a wheelchair with his right hand in typical spastic posture¹ resting in his lap. 
A large bone plate had been removed, leaving a sizeable depression, but he had a 
pleasant face and manner, smiling symmetrically and extending his left hand to 
shake my hand when we were introduced... His speech consisted almost 
entirely of ‘yes, but’—with an occasional ‘but-uh, no’ (usually at an appropriate 
time) and from time to time he would say ‘God damn it’... His disability in 
speaking was in marked contrast to his good comprehension of speech. Also, he 
frequently picked the correct picture (of four offered) when shown a printed 
word... When asked to write ‘something’ [i.e., anything that came to mind] 
he made a capital ‘F’ and a small squiggle. He was then asked to write the word
‘cat’ and he wrote a capital ‘C’, ‘o’ and ‘x’. He was asked if this was an attempt
to write his name and he said ‘God damn it’... In contrast with this minimal
production with a great deal of effort, he was able to copy a variety of geometric 
figures with ease... Judging from his facial expression, he was very pleased 
that he was able to do this so easily. 


It seems clear from the above that if one could combine G.E.’s verbal
and graphic skills with E.C.’s Gestalt abilities, we would have a normal
human being. We would also have, given their complementary motor 
controls, an organism showing no signs of bodily paralysis. In other words, 
someone like you or me. But G.E. and E.C. are not half persons. If not, it 
follows they were dual persons before hemispherectomy. G.E. was wrong 
to say she does not have both halves of her brain. She has her brain all 
right; it is just that she is no longer being helped out (in copying figures, for 
example) by another person now gone because of a tumor. 


47 WHO AM I? 


If that is so, if we cerebrally intact twin-brained human beings are really 
compounds of two persons, which is me? Am I the person whose conscious 
unity is rooted in left brain information-processing and right hand motor 
control; or am I the person whose consciousness is based in right brain
activity and subordinate left hand control? 

It seems to me that anyone who can get into the intellectual position to 
ask such a question must recognise he is left brain-based. (There would 
be a few exceptions, perhaps one in thirty.) What I mean is that raising 
the question itself presupposes a high degree of verbal abstraction, which is


1 ‘spastic posture’ = curled up and immobile. 
normally absent from the right brain. Before I argued that the language-
poor right brain would, perhaps, be better able than the speech brain to
recognise another centre of conscious control working in its body after
commissurotomy. That is, I think, still true. But to know this much and 
to know, after scientific acquaintance and philosophical reflection, that we 
are each of us but one of two persons in a single body, is another matter. I 
cannot believe my right-sided cerebral companion contributed much to the 
writing of this paper, for example, or understood it as he saw me writing 
it out. He leaves the philosophising to me. 

But what if ‘I’ were a sculptor, painter or musical improvisor? Bogen 
[1969b] presents an impressive array of clinical evidence, some of it going
back to the nineteenth century, that patients with gross injury or even 
complete hemispherectomy of the left brain are able to carry on these 
kinds of artistic creativity post-traumatically, and in some cases achieve 
greater work thereafter. If so, I would have to admit to myself that it 
was not me who deserved credit for such accomplishments, but the 
untalkative right brain-based person I have been verbosely overruling 
most of our lives. And if I got up to deliver a speech acknowledging praise 
for those accomplishments, I think I would be very humble indeed. 
I might even wish there were a way for me to send my apologies silently 
across the commissures.* 


Dalhousie University 


REFERENCES 


Bogen, J. E. [1969a]: ‘The Other Side of the Brain I: Dysgraphia and Dyscopia Following
Cerebral Commissurotomy’, Bulletin of the Los Angeles Neurological Society, 34,
pp. 73-105.

Bogen, J. E. [1969b]: ‘The Other Side of the Brain II: An Appositional Mind’, Bulletin
of the Los Angeles Neurological Society, 34, pp. 135-62.

Bogen, J. E. [1972]: Personal conversation with the author.

Bogen, J. E. and Gordon, H. W. [1972]: ‘Musical Tests for Functional Lateralization
with Intracarotid Amobarbital’, Nature, 230, pp. 524-26.

Eccles, J. C. [1967]: ‘Evolution and the Conscious Self’, in J. D. Roslansky (ed.): The
Human Mind, p. 11.

Eccles, J. C. [1970]: Facing Reality: Philosophical Adventures by a Brain Scientist, pp. 76-
80.

GAZZANIGA, M. S. [1970]: The Bisected Brain. 

GAZZANIGA, M. S. [1972a]: Personal conversation with the author. 

GAZZANIGA, M. S. [1972b]: Personal communication to the author. 

GAZZANIGA, M. S. [1973]: ‘Cerebral Dominance Viewed as a Decision System’, in S. J. 
Dimond and J. G. Beaumont (eds.): Hemispheric Functions. 

GAZZANIGA, M. S. and Sperry, R. W. [1966]: ‘Simultaneous Double Discrimination 
following Brain Bisection’, Psychonomic Science, 4, pp. 262-263. 


* I have profited from discussion of the philosophical implications of cerebral commis-
surotomy with Joseph E. Bogen, Glenda M. Bogen, S. A. M. Burns, Alexander Rosen-
berg, Richmond Campbell, and Simon Vieyra.




Brain Bisection and Personal Identity 355


GESCHWIND, N. [1965]: ‘Disconnection Syndromes in Animal and Man’, Brain, 88, p. 637.

Gibson, A. R., Filbey, R. A. and Gazzaniga, M. S. [1970]: ‘Hemispheric Differences as
Reflected by Reaction Time’, Federation Proceedings, 29, p. 658.

Gordon, H. W. [1973]: Verbal and Non-Verbal Information Processing in Man for Audition.
Doctoral thesis, California Institute of Technology.

Gordon, H. W. and Sperry, R. W. [1968]: ‘Lateralization of Olfactory Perception in the
Surgically Separated Hemispheres of Man’, Neuropsychologia, 7, pp. 111-20.

Hécaen, H. and Angelergues, R. [1962]: ‘Agnosia for Faces (Prosopagnosia)’, Archives
of Neurology, 7, pp. 92-100.

Levy, J. [1969]: ‘Possible Basis for the Evolution of Lateral Specialization of the Human 
Brain’, Nature, 224, pp. 614-15. 

Levy, J., TREVARTHEN, C. and Sperry, R. W. [1972]: ‘Perception of Bilateral Chimeric
Figures Following Hemispheric Deconnection’, Brain, 95, pp. 61-78.

Johnson, J. D. and Gazzaniga, M. S. [1969]: ‘Cortical-Cortical Pathways Involved in
Reinforcement’, Nature, 223, p. 71.

Johnson, J. D. and Gazzaniga, M. S. [1971]: ‘Some Effects of Nonreinforcement in
Split-Brain Monkeys’, Physiology and Behaviour, 6, pp. 703-6.

MacKay, D. M. [1966a]: Discussion in Eccles, J. C. (ed.): Brain and Conscious Experience,
pp. 312-13.

MacKay, D. M. [1966b]: ‘Cerebral Control and the Conscious Control of Action’, in
Eccles, J. C. (ed.): Brain and Conscious Experience, pp. 422-44.

Puccetti, R. [1973]: ‘Multiple Identity’, The Personalist, 54, 2.

Saul, R. E. and Sperry, R. W. [1968]: ‘Absence of Commissurotomy Symptoms with
Agenesis of the Corpus Callosum’, Neurology, 18, p. 307.

SHERRINGTON, C. [1947]: The Integrative Action of the Nervous System. 

Sperry, R. W. [1964]: ‘Problems Outstanding in the Evolution of Brain Function’, 
James Arthur Lecture, American Museum of Natural History. 

Sperry, R. W. [1966]: ‘Brain Bisection and Mechanisms of Consciousness’, in Eccles,
J. C. (ed.): Brain and Conscious Experience, pp. 298-313.

Sperry, R. W. [1968a]: ‘Hemisphere Deconnection and Unity of Conscious Awareness’,
American Psychologist, 23, pp. 723-33.

Sperry, R. W. [1968b]: ‘Mental Unity following Surgical Disconnection of the Cerebral
Hemispheres’, Harvey Lectures, Series 62.

Sperry, R. W. [1968c]: ‘Plasticity in Neural Maturation’, Developmental Biology Supple-
ment, 2, pp. 306-17.

Sperry, R. W. [1972a]: Personal conversation with the author.

Sperry, R. W. [1972b]: ‘Hemispheric Specialisation of Mental Faculties in the Brain of
Man’, Claremont Reading Conference, Thirty-Sixth Yearbook.

Sperry, R. W. and Gazzaniga, M. S. [1967]: ‘Language Following Surgical Discon-
nection of the Hemispheres’, in Milikan, C. H. (ed.): Brain Mechanisms Underlying
Speech and Language, pp. 108-21.

Sperry, R. W., Gazzaniga, M. S. and Bogen, J. E. [1969]: ‘Interhemispheric Relation-
ships: the Neocortical Commissures; Syndromes of Hemisphere Disconnection’,
in P. J. Vinken and G. W. Bruyn (eds.): Handbook of Clinical Neurology, 4.

ZAIDEL, D. [1972]: Personal conversation with the author.


Brit. J. Phil. Sci. 24 (1973), 357-408 Printed in Great Britain 357 


Discussions 


ON THE LOGICAL RELATIONS BETWEEN EXPRESSIONS OF 
DIFFERENT THEORIES 


The problem of logical relations between rival theories has occupied a promi- 
nent place in the recent literature of the Philosophy of Science. Already at the 
turn of the century, however, it was at the centre of controversy between 
Poincaré and LeRoy. Later, in the early 1930s, the claim that there are non-
intertranslatable languages and, in particular, that those of Newtonian and
Special Relativistic Mechanics are examples of such, became the central tenet
of radical conventionalism as developed by Ajdukiewicz.² A few years later, a
similar conceptual disparity between the classical and relativistic theories was
noted independently by Frank. Frank’s treatment was recently recalled in this
Journal by Giedymin, who has pointed out that it can be generalised to similar
cases of rival theories with the help of certain ideas of Carnap.³ The present
paper, which treats this point in greater detail, leads under suitable assumptions
to the following conclusion. If two empirical theories are inconsistent in respect
of consequences all of whose expressions are regarded as having meanings
determined independently of the theories in question (e.g. consequences describing
the results of experiments), no theoretical expression of either theory is translatable
into such an expression of the other, neither can expressions of this type stand in
relations of entailment or inconsistency.⁴

Logical questions concerning relations between expressions of different 
theories are similar to, and indeed presuppose, questions concerning translation 
between languages. For instance: under what conditions is a sentence of one 
language translatable into a second language by means of a sentence of that 
language equivalent in meaning with the first? Answers to questions of this type 
are determined only with the choice of conception of language and, thereby, of 
meaning. Different answers are to be expected depending on whether, for example, 
a language is regarded as a system of concepts and rules for their use, or as a 
collection of expressions and criteria for the acceptance or rejection of sentences 
in which they occur, or as a complex of dispositions to verbal behaviour, etc. 

In order to decide which conception of language is the most suitable, it is 
necessary to determine the consequences of adopting them for various purposes. 
This is one of the functions of the present paper with respect to its chosen 
conception, whose embodiment in the notion of a semantical system was already 
employed in an earlier paper.⁵ This conception finds a place for both extensional
and intensional constituents of meaning⁶; but no other constituents are taken into

1 Cf. Giedymin [1973].

2 Its history will be traced in Giedymin [1974].

3 Giedymin [1973].

4 A theoretical expression is any whose meaning is dependent on the specific postulates
of the theory in question.

5 Williams [1973].

6 To the extent that the latter can be constructed on the basis of the a priori component
of a semantical system: see Definition 1 below.

account. Thus once such a system is regarded as given, we do not enquire into
the way in which its expressions come to acquire the meanings, e.g. extensions,
they have.¹ The solution determined by this conception for problems related
to translation between languages can be stated briefly as follows. 
Let $\sigma_1$ and $\sigma_2$ be sentences of the languages $\mathbf{L}_1$, $\mathbf{L}_2$ respectively. The following
questions may be raised:


Are $\sigma_1$ and $\sigma_2$ equivalent?
Does $\sigma_1$ entail $\sigma_2$?
Are $\sigma_1$ and $\sigma_2$ inconsistent?


Let $\mathbf{L}$ be a suitably constructed extension of both $\mathbf{L}_1$ and $\mathbf{L}_2$. Then, relative to $\mathbf{L}$,
these questions can be settled by determining whether or not the following
sentences are analytic in $\mathbf{L}$:
$(\sigma_1 \equiv \sigma_2)$;
$(\sigma_1 \rightarrow \sigma_2)$;
$\sim(\sigma_1 \wedge \sigma_2)$.

If $\mathbf{L}$ is to be suitable for the logical correlation of all sentences of $\mathbf{L}_1$ and $\mathbf{L}_2$,
each expression of $\mathbf{L}_1$ or $\mathbf{L}_2$ must also be an expression of $\mathbf{L}$. In addition, all
expressions of $\mathbf{L}_1$ should have the same meaning in $\mathbf{L}$ as they have in $\mathbf{L}_1$; the
expressions of $\mathbf{L}_2$ should likewise retain their meanings in $\mathbf{L}$. Otherwise the
logical correlation given by $\mathbf{L}$ will be between languages other than $\mathbf{L}_1$ and $\mathbf{L}_2$.

The requirements just stated can undoubtedly be made exact in a variety of 
ways, each corresponding to a different conception of language and of meaning. 
Granted, however, the conception of language embodied in a semantical system, 
these requirements are met if and only if $\mathbf{L}$ is a conservative extension of both $\mathbf{L}_1$
and $\mathbf{L}_2$. These notions can be made precise as follows.

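We deal with the elementary predicate languages with identity in standard
formalisation. The same symbols are assumed to be used as the logical constants
and the denumerably many variables of each language. Apart from differences
in non-logical constants (finitary relational symbols), the same expressions are
regarded as well-formed formulae. If $\mathcal{L}$ is such a language (or grammar),
$V(\mathcal{L})$, $\mathrm{Form}(\mathcal{L})$, $\mathrm{Sent}(\mathcal{L})$ denote, respectively, the sets of non-logical constants,
well-formed formulae and sentences of $\mathcal{L}$.

It is assumed known under what conditions a non-empty relational system $\mathfrak{A}$
is said to be a structure for $\mathcal{L}$ in the classical two-valued sense, in symbols
$\mathfrak{A} \in \mathrm{Str}(\mathcal{L})$. The universe, or underlying set, of $\mathfrak{A}$ is denoted by $|\mathfrak{A}|$. If $\mathcal{L}$, $\mathcal{L}'$ are
languages such that $V(\mathcal{L}) \subseteq V(\mathcal{L}')$, given $\mathfrak{A}' \in \mathrm{Str}(\mathcal{L}')$ the structure for $\mathcal{L}$
obtained by omitting from $\mathfrak{A}'$ the relations correlated with symbols in
$V(\mathcal{L}') - V(\mathcal{L})$ is denoted by $\mathfrak{A}'|_{V(\mathcal{L})}$ and called the contraction of $\mathfrak{A}'$ to $\mathcal{L}$. Again if
$V(\mathcal{L}) \subseteq V(\mathcal{L}')$ and $\mathcal{K}' \subseteq \mathrm{Str}(\mathcal{L}')$, $\mathcal{K}'|_{V(\mathcal{L})}$ denotes the class of all contractions
to $\mathcal{L}$ of members of $\mathcal{K}'$.

If $\mathfrak{A} \in \mathrm{Str}(\mathcal{L})$, $\varphi \in \mathrm{Form}(\mathcal{L})$ and $a \in |\mathfrak{A}|^{\omega}$ is a denumerable sequence in $|\mathfrak{A}|$,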


1 A semantical system may be regarded as the formal semantical counterpart of the
pragmatic conception of language of Ajdukiewicz. The present conception derives more
directly, however, from the works of more recent Polish logicians, e.g. Wójcicki [1966],
Przełęcki [1969], Przełęcki & Wójcicki [1969].
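the reader is assumed to be familiar with the conditions under which $a$ satisfies
$\varphi$ in $\mathfrak{A}$, written $a \in \varphi^{\mathfrak{A}}$. Then if $\Phi \subseteq \mathrm{Form}(\mathcal{L})$ and $\mathcal{K} \subseteq \mathrm{Str}(\mathcal{L})$, we define


$\mathrm{Mod}(\Phi) = \{\mathfrak{A} \in \mathrm{Str}(\mathcal{L}) : |\mathfrak{A}|^{\omega} = \bigcap_{\varphi \in \Phi} \varphi^{\mathfrak{A}}\}$;
$\mathrm{Th}(\mathcal{K}) = \{\varphi \in \mathrm{Form}(\mathcal{L}) : \mathcal{K} \subseteq \mathrm{Mod}(\{\varphi\})\}$;
$\mathrm{Cn}(\Phi) = \{\varphi \in \mathrm{Form}(\mathcal{L}) : \bigwedge \mathfrak{A} \in \mathrm{Str}(\mathcal{L}) \, [\bigcap_{\psi \in \Phi} \psi^{\mathfrak{A}} \subseteq \varphi^{\mathfrak{A}}]\}$.


When $\Sigma \subseteq \mathrm{Sent}(\mathcal{L})$, it follows that


$\mathrm{Cn}(\Sigma) = \mathrm{Th}(\mathrm{Mod}(\Sigma))$.

Lastly, if $\mathcal{K} \subseteq \mathrm{Str}(\mathcal{L})$ the class of cardinals $c$ such that $\mathcal{K}$ contains a structure of
power $c$, namely the spectrum of $\mathcal{K}$, is denoted by $\mathrm{Spec}(\mathcal{K})$. When $\Sigma \subseteq \mathrm{Sent}(\mathcal{L})$,
$\mathrm{Spec}(\Sigma)$ will be written in place of $\mathrm{Spec}(\mathrm{Mod}(\Sigma))$ and, when $\sigma \in \mathrm{Sent}(\mathcal{L})$, $\mathrm{Spec}(\sigma)$
in place of $\mathrm{Spec}(\{\sigma\})$.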

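DEFINITION 1: The triple $\mathbf{L} = \langle \mathcal{L}, \mathcal{A}, \mathcal{M} \rangle$ is an (elementary) semantical
system, in symbols $\mathbf{L} \in SS\mathcal{L}$, iff:

(i) $\mathcal{L}$ is an elementary language (grammar);

(ii) $\mathcal{A} \subseteq \mathrm{Str}(\mathcal{L})$;

(iii) $\mathcal{A}$ is closed under isomorphism;

(iv) $\mathcal{M} \subseteq \mathcal{A}$;

(v) $\mathcal{M} \neq \emptyset$;

(vi) $|\mathfrak{A}| = |\mathfrak{B}|$ for all $\mathfrak{A}$, $\mathfrak{B}$ in $\mathcal{M}$.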

The first component of a semantical system L consists of a collection of primi- 
tive symbols and rules of formation. Its second component is to be understood 
as the family of ‘possible worlds’ for the system, whilst the third represents, 
roughly speaking, the ‘actual world’. The semantical rules of the system, 
however, may fail to single out a unique structure as its object of reference. 
Then the denotations of non-logical constants are only partially determined 
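(vague) and $\mathcal{M}$ contains more than one member. On the other hand, the common
universe of structures in $\mathcal{M}$, called the proper universe of $\mathbf{L}$, is understood to be
determined uniquely.¹ The sets $\mathrm{An}(\mathbf{L})$ and $\mathrm{Tr}(\mathbf{L})$ of analytic and true formulae of
$\mathbf{L}$ may now be defined by:

$\mathrm{An}(\mathbf{L}) = \mathrm{Th}(\mathcal{A})$;
$\mathrm{Tr}(\mathbf{L}) = \mathrm{Th}(\mathcal{M})$.
DEFINITION 2: If $\mathbf{L} = \langle \mathcal{L}, \mathcal{A}, \mathcal{M} \rangle$ and $\mathbf{L}' = \langle \mathcal{L}', \mathcal{A}', \mathcal{M}' \rangle$ are semantical
systems, $\mathbf{L}'$ is a conservative extension of $\mathbf{L}$, in symbols $\mathbf{L}' \in \mathrm{CE}(\mathbf{L})$, iff:
(i) $V(\mathcal{L}) \subseteq V(\mathcal{L}')$;
(ii) $\mathcal{A} = \mathcal{A}'|_{V(\mathcal{L})}$;
(iii) $\mathcal{M} = \mathcal{M}'|_{V(\mathcal{L})}$.
A conservative extension is one in which new expressions are introduced but old
ones retain their original meanings.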
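
The contraction operation and the test of Definition 2 can be illustrated on a small finite example. What follows is only a sketch under simplifying assumptions that are not part of the text: all structures share one fixed two-element universe, closure under isomorphism is ignored, and the names `make_structure`, `contraction`, `contract_class` and `is_conservative_extension` are hypothetical helpers introduced for the illustration.

```python
def make_structure(universe, **relations):
    """Toy structure: (universe, frozenset of (symbol, relation) pairs)."""
    rels = frozenset((name, frozenset(tuple(t) for t in rel))
                     for name, rel in relations.items())
    return (frozenset(universe), rels)


def contraction(struct, vocab):
    """Omit every relation whose symbol is not in vocab."""
    universe, rels = struct
    return (universe, frozenset((n, r) for n, r in rels if n in vocab))


def contract_class(structs, vocab):
    """Class of all contractions to vocab of the given structures."""
    return frozenset(contraction(s, vocab) for s in structs)


def is_conservative_extension(ext, base):
    """Each system is a triple (vocabulary, possible, proper).

    Checks clauses (i)-(iii) of Definition 2 on finite classes only.
    """
    v, possible, proper = base
    v_ext, possible_ext, proper_ext = ext
    return (v <= v_ext
            and contract_class(possible_ext, v) == frozenset(possible)
            and contract_class(proper_ext, v) == frozenset(proper))


U = {0, 1}
p_empty = make_structure(U, P=[])
p_zero = make_structure(U, P=[(0,)])
base = (frozenset({"P"}), {p_empty, p_zero}, {p_zero})

pq_empty = make_structure(U, P=[], Q=[])
pq_zero = make_structure(U, P=[(0,)], Q=[(1,)])
good = (frozenset({"P", "Q"}), {pq_empty, pq_zero}, {pq_zero})
# 'bad' drops a possible structure of the base system, so clause (ii) fails.
bad = (frozenset({"P", "Q"}), {pq_zero}, {pq_zero})

print(is_conservative_extension(good, base))  # True
print(is_conservative_extension(bad, base))   # False
```

The second system fails because contracting its possible structures to the old vocabulary no longer yields every possible structure of the base system; in the terms of the text, the old expression P would have changed its meaning.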

The logical correlation of two semantical systems $\mathbf{L}_1$, $\mathbf{L}_2$ will be effected by

1 Further explanatory remarks on the notion of a semantical system may be found in
Williams [1973], pp. 401-2.

means of a further system $\mathbf{L}$ which is to be a common conservative extension of $\mathbf{L}_1$
and $\mathbf{L}_2$, in symbols $\mathbf{L} \in \mathrm{CE}(\mathbf{L}_1, \mathbf{L}_2)$. Thus


$\mathrm{CE}(\mathbf{L}_1, \mathbf{L}_2) = \mathrm{CE}(\mathbf{L}_1) \cap \mathrm{CE}(\mathbf{L}_2)$.
It can easily be seen that $\mathrm{CE}(\mathbf{L}_1, \mathbf{L}_2)$ is non-empty if and only if
[*] $\mathbf{L}_1|_0 = \mathbf{L}_2|_0$,


where
$\mathbf{L}_i|_0 = \langle \mathcal{L}_0, \mathcal{A}_i|_{V(\mathcal{L}_0)}, \mathcal{M}_i|_{V(\mathcal{L}_0)} \rangle$ $(i = 1, 2)$

and

$V(\mathcal{L}_0) = V(\mathcal{L}_1) \cap V(\mathcal{L}_2)$.
For [*] to hold it is necessary that (a) $\mathrm{Spec}(\mathcal{A}_1) = \mathrm{Spec}(\mathcal{A}_2)$ and that (b) the
proper universes of $\mathbf{L}_1$ and $\mathbf{L}_2$ coincide. (a) and (b) are also sufficient if $V(\mathcal{L}_0) = \emptyset$.
If these conditions hold, yet $\mathrm{CE}(\mathbf{L}_1, \mathbf{L}_2)$ is empty, $\mathbf{L}_1$ and $\mathbf{L}_2$ have non-logical
constants in common, of which some are used ambiguously in the two systems.
This situation can be accommodated by replacing non-logical constants common
to both systems by distinct constants, without otherwise altering either system.
The question whether some of the constants initially common to the two
systems were nevertheless equivalent in meaning can still be raised in connection
with the revised systems. On the other hand, if either (a) or (b) fails, it is no
longer possible to correlate $\mathbf{L}_1$ and $\mathbf{L}_2$ by means of a common conservative
extension as defined above. Since the failure of these conditions need not
exclude another means of correlating such systems, the present treatment
restricts itself to the correlation of systems satisfying (a) and (b) for which a
common conservative extension exists (after possible relabelling of constants),
leaving open the satisfactory treatment of systems failing to satisfy them.
In the cases to which special attention will be paid below, however, [*] holds
and therefore (a) and (b) also.


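DEFINITION 3: If $\mathbf{L}_1, \mathbf{L}_2 \in SS\mathcal{L}$ and $\mathrm{CE}(\mathbf{L}_1, \mathbf{L}_2) \neq \emptyset$, $\varphi_1 \in \mathrm{Form}(\mathcal{L}_1)$ and
$\varphi_2 \in \mathrm{Form}(\mathcal{L}_2)$ are intertranslatable iff there is a system $\mathbf{L} \in SS\mathcal{L}$ such that:

(i) $\mathbf{L} \in \mathrm{CE}(\mathbf{L}_1, \mathbf{L}_2)$;

(ii) $(\varphi_1 \equiv \varphi_2) \in \mathrm{An}(\mathbf{L})$.
We say also that $\varphi_1$ is translatable into $\mathbf{L}_2$ relative to $\mathbf{L}$ if and only if $\mathbf{L}$ satisfies
(i) and (ii) for some $\varphi_2 \in \mathrm{Form}(\mathcal{L}_2)$, and that $\varphi_1$ is translatable into $\mathbf{L}_2$ if and only
if $\varphi_1$ and $\varphi_2$ are intertranslatable for some $\varphi_2 \in \mathrm{Form}(\mathcal{L}_2)$.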

Remark 1: Under the hypotheses of the definition, all analytic sentences of
$\mathbf{L}_1$ and $\mathbf{L}_2$ are intertranslatable and, likewise, all contradictory sentences. So far
as sentences are concerned, we are dealing with a concept of ‘free’ translation.²
Without entering further into the problem of defining ‘literal’ translation for
sentences, the following distinction can be made for languages. $\mathbf{L}_1$ is said to be
literally translatable into $\mathbf{L}_2$ if there is a system $\mathbf{L} \in SS\mathcal{L}$ relative to which every
atomic formula of $\mathbf{L}_1$ is translatable into $\mathbf{L}_2$. A yet stronger concept of literal


1 The extension of this treatment to the general case, which can in fact be made in several
ways, gives rise to no difficulties of principle, only of calculation. It seems best therefore,
in a first airing of these ideas, to restrict the discussion to the simpler cases.

2 Cf. Kemeny [1956], pp. 159-60.

translatability is obtained by the demand for a system $\mathbf{L}$ relative to which every
atomic formula of $\mathbf{L}_1$ is translatable into $\mathbf{L}_2$ by an atomic formula of $\mathbf{L}_2$.


Remark 2: Let $\varphi_1$ be translatable into $\mathbf{L}_2$ by both $\varphi_2$ and $\varphi_2'$ relative to $\mathbf{L}$.
Then
$(\varphi_2 \equiv \varphi_2') \in \mathrm{An}(\mathbf{L}) \cap \mathrm{Sent}(\mathcal{L}_2) = \mathrm{An}(\mathbf{L}_2)$.
It follows that translation relative to a single system is unique to within analytic
equivalence. On the other hand, if $\varphi_1$ is translatable into $\mathbf{L}_2$ by $\varphi_2$ relative to $\mathbf{L}$
and $\varphi_2'$ relative to $\mathbf{L}'$, it no longer follows that $(\varphi_2 \equiv \varphi_2') \in \mathrm{An}(\mathbf{L}_2)$. There may
therefore be essentially different ways of translating $\varphi_1$ into $\mathbf{L}_2$. The same holds
for translation of languages in both senses. We arrive, by this route also, at the
indeterminacy of translation.¹
A number of inter-linguistic relations can be defined in a similar way.
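DEFINITION 4: If $\mathbf{L}_1, \mathbf{L}_2 \in SS\mathcal{L}$ and $\mathrm{CE}(\mathbf{L}_1, \mathbf{L}_2) \neq \emptyset$, $\sigma_1 \in \mathrm{Sent}(\mathcal{L}_1)$ entails
$\sigma_2 \in \mathrm{Sent}(\mathcal{L}_2)$ relative to $\mathbf{L}$ iff:


(i) $\mathbf{L} \in \mathrm{CE}(\mathbf{L}_1, \mathbf{L}_2)$;

(ii) $(\sigma_1 \rightarrow \sigma_2) \in \mathrm{An}(\mathbf{L})$.

DEFINITION 5: If $\mathbf{L}_1, \mathbf{L}_2 \in SS\mathcal{L}$ and $\mathrm{CE}(\mathbf{L}_1, \mathbf{L}_2) \neq \emptyset$, $\sigma_1 \in \mathrm{Sent}(\mathcal{L}_1)$ and
$\sigma_2 \in \mathrm{Sent}(\mathcal{L}_2)$ are inconsistent relative to $\mathbf{L}$ iff:


(i) $\mathbf{L} \in \mathrm{CE}(\mathbf{L}_1, \mathbf{L}_2)$;
(ii) $\sim(\sigma_1 \wedge \sigma_2) \in \mathrm{An}(\mathbf{L})$.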
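
The dependence of Definitions 3 to 5 on the choice of common conservative extension may be seen in a toy case. The sketch below rests on assumptions not made in the text: a single two-element universe, one unary predicate P in the first language and one unary predicate Q in the second, sentences represented as Python predicates on a structure, and the proper structures left out of account. The names `analytic`, `equivalent`, `entails`, `inconsistent`, `ext_a` and `ext_b` are hypothetical.

```python
from itertools import chain, combinations

U = frozenset({0, 1})  # one fixed universe (a simplifying assumption)


def subsets(s):
    """All subsets of s, as frozensets."""
    items = sorted(s)
    return [frozenset(c) for c in chain.from_iterable(
        combinations(items, k) for k in range(len(items) + 1))]


# A structure is a dict {symbol: extension of a unary predicate}.
sigma1 = lambda w: len(w["P"]) > 0   # "there is an x with Px"
sigma2 = lambda w: w["Q"] != U       # "not every x is Q"
sigma3 = lambda w: w["Q"] == U       # "every x is Q"


def analytic(possible, sentence):
    """A sentence is analytic if it holds in every possible structure."""
    return all(sentence(w) for w in possible)


def equivalent(possible, s1, s2):
    return analytic(possible, lambda w: s1(w) == s2(w))


def entails(possible, s1, s2):
    return analytic(possible, lambda w: (not s1(w)) or s2(w))


def inconsistent(possible, s1, s2):
    return analytic(possible, lambda w: not (s1(w) and s2(w)))


# Two candidate extensions. Contracted to P alone, or to Q alone, each yields
# every subset of U, so both conserve the same pair of original systems.
ext_a = [{"P": p, "Q": U - p} for p in subsets(U)]  # Q read as the complement of P
ext_b = [{"P": p, "Q": p} for p in subsets(U)]      # Q read as coextensive with P

print(equivalent(ext_a, sigma1, sigma2))    # True
print(equivalent(ext_b, sigma1, sigma2))    # False
print(inconsistent(ext_a, sigma1, sigma3))  # True
print(inconsistent(ext_b, sigma1, sigma3))  # False
```

Relative to the first extension the two sentences are intertranslatable; relative to the second they are not, which is the situation described in Remark 2.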

Besides relations between individual expressions, the present framework 
provides for the definition of logical relations between sets of expressions of 
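different languages, e.g. reducibility. Given sets of sentences $T_1$, $T_2$ and the
further set $\Sigma$, $T_1$ is said to be reducible to $T_2$ relative to $\Sigma$ if and only if the
additional assumptions $\Sigma$ establish such a connection between expressions of $T_1$
and $T_2$ as to establish also the derivability of $T_1$ from $T_2$ with their help.² Thus:


DEFINITION 6: If $\mathbf{L}_1, \mathbf{L}_2 \in SS\mathcal{L}$, $\mathrm{CE}(\mathbf{L}_1, \mathbf{L}_2) \neq \emptyset$, $T_1 \subseteq \mathrm{Sent}(\mathcal{L}_1)$, $T_2 \subseteq
\mathrm{Sent}(\mathcal{L}_2)$, $\Sigma \subseteq \mathrm{Sent}(\mathcal{L})$ where $V(\mathcal{L}) = V(\mathcal{L}_1) \cup V(\mathcal{L}_2)$, $T_1$ is reducible
to $T_2$ relative to $\Sigma$ iff there is a system $\mathbf{L} \in SS\mathcal{L}$ such that:


(i) $\mathbf{L} \in \mathrm{CE}(\mathbf{L}_1, \mathbf{L}_2)$;

(ii) $T_1 \subseteq \mathrm{Cn}(\mathrm{An}(\mathbf{L}) \cup \Sigma \cup T_2)$.³
In general $\Sigma$ may include both factual and analytic sentences. However, when
$\Sigma \subseteq \mathrm{An}(\mathbf{L})$, $T_1$ is said to be (simply) reducible to $T_2$; or when, although
$\Sigma \not\subseteq \mathrm{An}(\mathbf{L})$, all members of $\Sigma$ may be treated as analytic by means of some other
extension satisfying (i), (ii). Thus:

DEFINITION 7: Under the hypotheses of Definition 6, $T_1$ is reducible to $T_2$ iff $T_1$
is reducible to $T_2$ relative to $\emptyset$.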


These ideas can be applied to the case where systems $\mathbf{L}_1$, $\mathbf{L}_2$ are the languages
of empirical theories $T_1$, $T_2$ as follows. Suppose that $\mathbf{L}_i$ $(i = 1, 2)$ is constructed
from an already given empirical system $\mathbf{L}_0$ on the basis of $T_i$. The set $T_i$, as well


1 Cf. Quine [1960], Chapter II.

2 Cf. Nagel [1961], pp. 351-8.

3 A less perspicuous, though perhaps more appropriate, interpretation of the requirement
of derivability would be: $T_1 \subseteq \mathrm{Th}(\mathcal{A} \cap \mathrm{Mod}(\Sigma \cup T_2))$.

perhaps as possessing factual content, is to be regarded as a set of postulates
governing those of its expressions which do not occur as expressions of $\mathbf{L}_0$.
Thus the additional expressions are to be understood as far as possible, i.e.
without depriving $T_i$ of factual content, so as to verify the sentences of $T_i$
given prior interpretations for the expressions of $\mathbf{L}_0$, there being no other way
in which proper denotations of the additional expressions are determined.
Thus, for each $i = 1, 2$, $\mathbf{L}_i = \langle \mathcal{L}_i, \mathcal{A}_i, \mathcal{M}_i \rangle$ is to satisfy:


(Pi) $\mathbf{L}_i \in \mathrm{CE}(\mathbf{L}_0)$;
(Pii) $\bigwedge \mathfrak{A} \in \mathrm{Str}(\mathcal{L}_i) \, [\mathfrak{A}|_0 \in (\mathcal{A}_0 \cap \mathrm{Mod}(T_i)|_0) \rightarrow (\mathfrak{A} \in \mathcal{A}_i \leftrightarrow \mathfrak{A} \in \mathrm{Mod}(T_i))]$;
(Piii) $\mathcal{A}_i \cap \mathcal{M}_0|^{i} \subseteq \mathcal{M}_i$;


where we write $|_0$ in place of $|_{V(\mathcal{L}_0)}$, and $\mathcal{M}_0|^{i}$ consists of those structures $\mathfrak{A} \in
\mathrm{Str}(\mathcal{L}_i)$ for which $\mathfrak{A}|_0 \in \mathcal{M}_0$.¹

The way such systems are to be correlated may be represented schematically 
as follows (where arrows are directed towards conservative extensions): 


            L
          ↗   ↖
        L₁     L₂
          ↖   ↗
            L₀


The specific relationship between $\mathbf{L}_0$ and $\mathbf{L}_i$ $(i = 1, 2)$ depends on the choice of
solution to the problem of separating the factual and conventional components
of $T_i$. Logical relations between $\mathbf{L}_1$ and $\mathbf{L}_2$ likewise depend on this choice.
When $\mathbf{L}_i$ is constructed from $\mathbf{L}_0$ and $T_i$ on the basis of Solution I we write
$\mathbf{L}_i \in \mathrm{CE}^{\mathrm{I}}_{T_i}(\mathbf{L}_0)$.²

Besides the restriction to cases where $\mathbf{L}_1$ and $\mathbf{L}_2$ are theoretical extensions of a
common subsystem $\mathbf{L}_0$, attention will only be given to a special question arising
in such cases. It is one of special interest, however, from the methodological
point of view: the question of logical relations between expressions of jointly
inconsistent theories.³

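Some further preliminaries are necessary. If $T_1 \subseteq \mathrm{Sent}(\mathcal{L}_1)$, $T_2 \subseteq \mathrm{Sent}(\mathcal{L}_2)$
and $\mathcal{A}_0 \subseteq \mathrm{Str}(\mathcal{L}_0)$, where $V(\mathcal{L}_0) = V(\mathcal{L}_1) \cap V(\mathcal{L}_2)$, $T_1$ and $T_2$ are said
to be jointly inconsistent with respect to $\mathcal{A}_0$ (or, when $\mathcal{A}_0 = \mathrm{Str}(\mathcal{L}_0)$, simply
jointly inconsistent) if and only if

$\mathcal{A}_0 \cap \mathrm{Mod}(T_1)|_0 \cap \mathrm{Mod}(T_2)|_0 = \emptyset$.


If $\mathcal{A}_0 = \mathrm{Mod}(A_0)$ for some $A_0 \subseteq \mathrm{Sent}(\mathcal{L}_0)$, it is known that $T_1$ and $T_2$ are
inconsistent in the sense defined only if there is a sentence $\sigma_0$ such that


$\sigma_0 \in \mathrm{Cn}(A_0 \cup T_1) \cap \mathrm{Sent}(\mathcal{L}_0)$
$\sim\sigma_0 \in \mathrm{Cn}(A_0 \cup T_2) \cap \mathrm{Sent}(\mathcal{L}_0)$.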
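
The definition of joint inconsistency can be checked mechanically in a finite toy case. The sketch below is an illustration only, under assumptions not found in the text: a single two-element universe, a common vocabulary consisting of one unary predicate O, one further unary predicate for each theory, and two hypothetical theories `t1` and `t2` given as Python predicates. The helper names are likewise hypothetical.

```python
from itertools import chain, combinations, product

U = frozenset({0, 1})  # one fixed finite universe (a simplifying assumption)


def subsets(s):
    """All subsets of s, as frozensets."""
    items = sorted(s)
    return [frozenset(c) for c in chain.from_iterable(
        combinations(items, k) for k in range(len(items) + 1))]


# Each theory is a predicate on (O, X): O is the extension of the common
# observational predicate, X the extension of the theory's own term.
def t1(o, p):
    return o == U and len(p) > 0   # "everything is O, and something is P"


def t2(o, q):
    return o != U                  # "something is not O" (Q is unconstrained)


def reducts_of_models(theory):
    """Contractions to the common vocabulary of all models over U."""
    return {o for o, x in product(subsets(U), repeat=2) if theory(o, x)}


a0 = set(subsets(U))               # every O-structure over U counts as possible
m1 = reducts_of_models(t1)
m2 = reducts_of_models(t2)

# Jointly inconsistent: no possible O-structure extends to models of both.
jointly_inconsistent = not (a0 & m1 & m2)
print(jointly_inconsistent)
```

In this case the sentence "everything is O" plays the part of the witness sentence of the common language: it holds in every contraction of a model of the first theory and fails in every contraction of a model of the second, although each theory taken alone has models.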


1 Williams [1973], pp. pat The question of what conditions should be imposed on $\mathbf{L}_0$
to ensure it an empirical character is discussed in Przełęcki [1969], pp. 24-34, 100-2
and Williams [1974].

2 Solution I, cf. Williams [1973], p. 404, is essentially that of Przełęcki and Wójcicki [1969],
p. 383 which, for finitely axiomatizable theories, is equivalent to the solution of Carnap
[1963], pp. 958-66.

3 The implications of the present approach regarding this question, which he has kindly
brought to my attention, were first noticed by Jerzy Giedymin.
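If $\varphi \in \mathrm{Form}(\mathcal{L})$ and $A$ is a non-empty set, $\varphi^{A} \subseteq P(A^{\omega})$ is to stand for all those
collections of sequences in $A$ which satisfy $\varphi$ under suitable interpretations of
non-logical constants:
$\varphi^{A} = \{\varphi^{\mathfrak{A}} : \mathfrak{A} \in \mathrm{Str}(\mathcal{L}) \;\&\; |\mathfrak{A}| = A\}$.
Lastly, by $\mathcal{L}_{T_1}$ is understood the language with non-logical vocabulary
$V(\mathcal{L}_{T_1}) = V(\mathcal{L}_1) - V(\mathcal{L}_0)$. $\mathcal{L}_{T_2}$ is understood correspondingly.
THEOREM 1: Given $\mathbf{L}_0, \mathbf{L}_1, \mathbf{L}_2 \in SS\mathcal{L}$ and $T_i \subseteq \mathrm{Sent}(\mathcal{L}_i)$ $(i = 1, 2)$ such that
(a) $V(\mathcal{L}_0) = V(\mathcal{L}_1) \cap V(\mathcal{L}_2)$;
(b) $\mathbf{L}_i \in \mathrm{CE}^{\mathrm{I}}_{T_i}(\mathbf{L}_0)$ $(i = 1, 2)$;
(c) $T_1$, $T_2$ are jointly inconsistent with respect to $\mathcal{A}_0$;
(d) $\mathrm{Spec}(\mathcal{A}_0) = \mathrm{Spec}(\mathcal{A}_0 \cap \mathrm{Mod}(T_i)|_0)$ $(i = 1, 2)$;
$\varphi_1 \in \mathrm{Form}(\mathcal{L}_{T_1})$ and $\varphi_2 \in \mathrm{Form}(\mathcal{L}_{T_2})$ are intertranslatable iff:
(i) $\bigwedge \mathfrak{A} \in \mathcal{A}_0 \, [\varphi_1^{|\mathfrak{A}|} = \varphi_2^{|\mathfrak{A}|}]$;
(ii) $\bigwedge \mathfrak{A}_0 \in \mathcal{A}_0 \bigwedge \Delta \in \varphi_1^{|\mathfrak{A}_0|} \, [\mathfrak{A}_0 \in \mathrm{Mod}(T_1)|_0 \rightarrow
\bigvee \mathfrak{A}_1 \in \mathrm{Mod}(T_1) \, (\mathfrak{A}_1|_0 = \mathfrak{A}_0 \;\&\; \varphi_1^{\mathfrak{A}_1} = \Delta)]$;
(iii) $\bigwedge \mathfrak{A}_0 \in \mathcal{A}_0 \bigwedge \Delta \in \varphi_2^{|\mathfrak{A}_0|} \, [\mathfrak{A}_0 \in \mathrm{Mod}(T_2)|_0 \rightarrow
\bigvee \mathfrak{A}_2 \in \mathrm{Mod}(T_2) \, (\mathfrak{A}_2|_0 = \mathfrak{A}_0 \;\&\; \varphi_2^{\mathfrak{A}_2} = \Delta)]$.
(Hypothesis (d) is included only for the sake of simplicity. However, it seems to
be satisfied in all typical cases.)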

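Two types of instance of Theorem 1 are of special interest: $\varphi_1$ is an atomic
formula, $\varphi_1$ is a sentence. In both cases, it will be assumed for simplicity that
$\mathcal{A}_0 = \mathrm{Str}(\mathcal{L}_0)$. (The following remarks are easily adapted to the general case.)
First let $\varphi_1 = P x_0 \ldots x_{n-1}$ where $P \in V(\mathcal{L}_{T_1})$. Then

$\varphi_1^{A} = \{R \times A^{\omega - n} : R \subseteq A^{n}\}$.
Let $P^{\mathfrak{A}_1}$ denote the relation correlated with $P$ in the structure $\mathfrak{A}_1$. Then, from
(ii), $\varphi_1$ is translatable into $\mathbf{L}_2$ only if the following condition is satisfied:
(j) $\bigwedge \mathfrak{A}_0 \in \mathrm{Str}(\mathcal{L}_0) \bigwedge R \subseteq |\mathfrak{A}_0|^{n} \, [\mathfrak{A}_0 \in \mathrm{Mod}(T_1)|_0 \rightarrow
\bigvee \mathfrak{A}_1 \in \mathrm{Mod}(T_1) \, (\mathfrak{A}_1|_0 = \mathfrak{A}_0 \;\&\; P^{\mathfrak{A}_1} = R)]$.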
Now it has been questioned whether terms such as $P$ satisfying (j) play
a significant role in empirical science. For no empirical findings concerning the
composition of the proper structures of $\mathbf{L}_0$ can conflict with the decision to
apply or to withhold $P$ in any given case, even assuming the truth of $T_1$. Such
terms are indeed amongst those to be considered ‘O-meaningless’ in both of two
exact senses already proposed.¹ If the designation of such terms as empirically
meaningless is granted, it can be stated that, from the standpoint of Solution I:
(A) No empirically meaningful theoretical term of a consistent empirical theory $T_1$
is translatable into the language of a theory $T_2$ by means of a theoretical expression
of $T_2$ if $T_1$ and $T_2$ are jointly inconsistent.
(This conclusion, and conclusions similarly stated below, are subject to all
limitations set out in the accompanying text.)
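
Condition (j) can be tested directly for a unary term over a small universe. The following sketch is offered only as an illustration; the two-element universe, the single common predicate O, and the two sample theories `constraining` and `idle` are assumptions introduced here and are not drawn from the text.

```python
from itertools import chain, combinations

U = frozenset({0, 1})  # one fixed finite universe (a simplifying assumption)


def subsets(s):
    """All subsets of s, as frozensets."""
    items = sorted(s)
    return [frozenset(c) for c in chain.from_iterable(
        combinations(items, k) for k in range(len(items) + 1))]


def satisfies_j(theory):
    """Check condition (j) for a unary theoretical term P over U.

    theory(o, p) says whether the structure with extensions o for O and
    p for P is a model. Whenever an O-structure extends to some model,
    (j) demands that every subset R of U occur as the extension of P in
    some model extending that O-structure.
    """
    every_r = set(subsets(U))
    for o in subsets(U):
        realisable = {p for p in every_r if theory(o, p)}
        if realisable and realisable != every_r:
            return False
    return True


constraining = lambda o, p: p <= o   # "whatever is P is O"
idle = lambda o, p: len(o) > 0       # "something is O"; P is left free

print(satisfies_j(constraining))  # False
print(satisfies_j(idle))          # True
```

Only the second theory, which places no constraint on P whatever, satisfies (j). This agrees with the observation above that a term satisfying (j) is one whose application no finding about the structures of the common language can affect.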


1 Wójcicki [1966].
Suppose now that $\varphi_1$ is the sentence $\sigma_1$. Then for any non-empty set $A$, $\sigma_1^{A}$ is
either $\{A^{\omega}\}$, $\{\emptyset\}$ or $\{A^{\omega}, \emptyset\}$. The first two cases, occurring when either $\sigma_1$ or
$\sim\sigma_1$ is valid in the domain $A$, are less interesting from the methodological point
of view. Assuming then that

(1) Neither $\sigma_1$ nor $\sim\sigma_1$ is valid in any non-empty domain,
$\sigma_1$ is translatable into $\mathbf{L}_2$ only if

(jj) $\mathrm{Mod}(T_1)|_0 = \mathrm{Mod}(T_1 \cup \{\sigma_1\})|_0 \cap \mathrm{Mod}(T_1 \cup \{\sim\sigma_1\})|_0$.

(It suffices to assume that $\mathrm{Spec}(T_1) \subseteq \mathrm{Spec}(\sigma_1) \cap \mathrm{Spec}(\sim\sigma_1)$ in place of (1).)
It follows that, from the standpoint of Solution I:
(B) No theoretical sentence $\sigma_1$ satisfying (1) which is a theorem or whose negation is
a theorem of a consistent empirical theory $T_1$ is translatable into the language of a
theory $T_2$ by a theoretical expression of $T_2$ if $T_1$ and $T_2$ are jointly inconsistent.


Concerning sentences satisfying (jj) which are independent of $T_1$ the following
remarks can be made. No empirical findings concerning the composition of the
proper structures of $\mathbf{L}_0$ can conflict with the decision to assert or to deny $\sigma_1$,
even assuming the truth of $T_1$. Consequently it has been questioned whether
such sentences can properly be regarded as empirically significant. These
sentences, in fact, fail to belong to one of the classes that has been precisely
identified with the class of ‘empirically meaningful’ sentences in one sense of
that admittedly ambiguous expression.¹ Granted this designation, however, it
can be stated that, from the standpoint of Solution I:


(B′) No empirically meaningful theoretical sentence $\sigma_1$ satisfying (1) in the language
of a consistent empirical theory $T_1$ is translatable into the language of a theory $T_2$
by a theoretical expression of $T_2$, if $T_1$ and $T_2$ are jointly inconsistent.
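THEOREM 2: Given $\mathbf{L}_0, \mathbf{L}_1, \mathbf{L}_2 \in SS\mathcal{L}$ and $T_i \subseteq \mathrm{Sent}(\mathcal{L}_i)$ $(i = 1, 2)$ such that
(a) $V(\mathcal{L}_0) = V(\mathcal{L}_1) \cap V(\mathcal{L}_2)$;
(b) $\mathbf{L}_i \in \mathrm{CE}^{\mathrm{I}}_{T_i}(\mathbf{L}_0)$ $(i = 1, 2)$;
(c) $T_1$, $T_2$ are jointly inconsistent with respect to $\mathcal{A}_0$;
(d) $\mathrm{Spec}(\mathcal{A}_0) \subseteq \mathrm{Spec}(\mathcal{A}_0 \cap \mathrm{Mod}(T_i)|_0)$ $(i = 1, 2)$;
if $\sigma_1 \in \mathrm{Sent}(\mathcal{L}_{T_1})$ and $\sigma_2 \in \mathrm{Sent}(\mathcal{L}_{T_2})$, there is a system $\mathbf{L} \in SS\mathcal{L}$ such that
$\sigma_1$ entails $\sigma_2$ relative to $\mathbf{L}$ iff:
(i) $\mathrm{Spec}(\mathcal{A}_0) \cap \mathrm{Spec}(\sigma_1) \subseteq \mathrm{Spec}(\sigma_2)$;
(ii) $\mathrm{Spec}(\mathcal{A}_0) \cap \mathrm{Spec}(\sim\sigma_2) \subseteq \mathrm{Spec}(\sim\sigma_1)$;
(iii) $\mathcal{A}_0 \cap \mathrm{Mod}(T_1 \cup \{\sim\sigma_2\})|_0 \subseteq \mathrm{Mod}(T_1 \cup \{\sim\sigma_1\})|_0$;
(iv) $\mathcal{A}_0 \cap \mathrm{Mod}(T_2 \cup \{\sigma_1\})|_0 \subseteq \mathrm{Mod}(T_2 \cup \{\sigma_2\})|_0$.
Now if $\mathcal{A}_0 = \mathrm{Str}(\mathcal{L}_0)$ and $\sigma_2$ satisfies
(2) $\sigma_2$ is not valid in any non-empty domain,
condition (iii) reduces to
$\mathrm{Mod}(T_1)|_0 \subseteq \mathrm{Mod}(T_1 \cup \{\sim\sigma_1\})|_0$.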


+Przelecki [1969], pp. 94-5, whose condition ‘MP S Ver(m)’ has the meaning here of 
REM’. . 


It follows that, from the standpoint of Solution I:

(C) No theoretical consequence of a consistent empirical theory $T_1$ can entail any
theoretical sentence $\sigma_2$ satisfying (2) in the language of a theory $T_2$, if $T_1$ and $T_2$ are
jointly inconsistent.
(This also holds if (2) is replaced by: $\mathrm{Spec}(T_1) \cap \mathrm{Spec}(\sim\sigma_2) \neq \emptyset$.)

The question of inconsistency of theoretical sentences of $\mathbf{L}_1$ and $\mathbf{L}_2$ is already
decided by Theorem 2 if $\sim\sigma_2$ replaces $\sigma_2$. Accordingly, from the standpoint of
Solution I:

(D) No theoretical consequence of an empirical theory $T_1$ can be inconsistent with
any theoretical consequence of an empirical theory $T_2$ if $T_1$ and $T_2$ are jointly
inconsistent and $\mathrm{Spec}(T_1) \cap \mathrm{Spec}(T_2) \neq \emptyset$.

(This result, which is independent of hypothesis (d) of Theorem 2, can be
strengthened in a number of ways by weakening the new restriction on $T_1$ and
$T_2$. This simple condition, however, which states the existence of a cardinal $c$
such that $T_1$ and $T_2$ have each a model of power $c$, seems almost always to be
satisfied.)

Let us now briefly review these questions from the standpoint of Solution II.¹
It will be assumed for simplicity that $\mathrm{Spec}(T_1) \cap \mathrm{Spec}(T_2)$ includes every non-
zero cardinal and, again, that $\mathcal{A}_0 = \mathrm{Str}(\mathcal{L}_0)$. So far as intertranslatability is
concerned the situation is not given exactly by Theorem 2. Nevertheless the 
situation is similar to the extent that conclusion (A) may also be asserted from 
the standpoint of Solution II. Conclusion (B), on the other hand, can no longer be
asserted for the following reason. Under the stated assumptions, analytic
T-sentences of $\mathbf{L}_1$ and $\mathbf{L}_2$ are exactly the theorems of $T_1$ and of $T_2$. It follows
that all theoretical consequences of $T_1$ and $T_2$ are intertranslatable. A similar
result holds for theoretical sentences whose negations are theorems of $T_1$ or $T_2$.²
If, however, analytic and contradictory sentences are excluded from the class
of empirically meaningful sentences, conclusion (B′) may also be asserted from
the standpoint of Solution II. Conclusion (C), however, cannot be asserted since
every theoretical consequence of $T_2$, being analytic in $\mathbf{L}_2$, is entailed by each
sentence of $\mathbf{L}_1$. (Correspondingly, every sentence of $\mathbf{L}_2$ is entailed by the negation
of any theoretical consequence of $T_1$.) On the other hand, conclusion (D) holds
as well from the standpoint of Solution II.

With respect to Solution III,³ the detailed conclusions are more complicated.
However, in typical cases conclusions (A), (B), (B′) hold equally from the standpoint
of Solution III. Concerning entailment and inconsistency, on the other hand,
conclusions (C) and (D) are no longer correct. Inconsistency between theoretical
consequences $\sigma_1$ of $T_1$ and $\sigma_2$ of $T_2$ is now possible, though it is necessary for
these sentences to axiomatize, respectively, the sets $\mathrm{Cn}(T_1) \cap \mathrm{Sent}(L_{T_1})$ and
$\mathrm{Cn}(T_2) \cap \mathrm{Sent}(L_{T_2})$. (The corresponding, but stronger, model-theoretic
conditions are in fact also necessary.) It follows that, from the standpoint of Solution
III, the inconsistency of $T_1$ and $T_2$ need not be restricted to the common
sub-language $L_0$. Theoretical consequences of the two theories can be inconsistent,
but only if each expresses the full T-content of its respective theory. A
corresponding conclusion holds for entailment.


1 Q.v. Williams [1973], p. 404.

2 In view of conclusion (A), however, it is doubtful whether theoretical consequences of
$T_1$ and of $T_2$, or their negations, would remain intertranslatable, from the standpoint of
Solution II, with respect to any stricter notion of literal translatability.

3 Cf. op. cit., p. 413.

Lastly, it should be remembered that the situations which have been discussed
above in detail, viz. those conforming to the schema on page 362, are of a quite
special kind. According to that schema, systems $L_1$, $L_2$ include a common
subsystem $L_0$. All additional expressions of $L_i$ are introduced on the basis of
theory $T_i$ ($i = 1, 2$). Now if indeed $L_1$ and $L_2$ can be correlated by means of an
extension L in the manner set out before, there certainly exists a common
subsystem, $L_{00}$ say. It cannot be stated in general, however, that the theory $T_1$
functions as a set of postulates for all its expressions not occurring in $L_{00}$.
Some may be provided with direct non-verbal interpretations, whilst others may
be governed by a postulate set other than $T_1$ (e.g. a proper subset of $T_1$). In that
case, the system playing the part of $L_0$ in the earlier schema with respect to the
extension by means of postulates will not generally be the common subsystem
$L_{00}$ but an extension, $L_{01}$ say. Similar remarks hold for the expressions of $T_2$.
In this way we are led to the following schema:


(2) [Schema: diagram of the systems $L_1$, $L_2$, $L_{01}$, $L_{02}$, $L_{00}$ and the extension $L$.]

Although the detailed conclusions here are too complex to state at present,
the following remarks can be made. First, unless $L_{01}$ and $L_{02}$ coincide, answers
to the questions raised before no longer have the uniformly negative character of
(A)-(D) concerning the earlier schema. Secondly, with regard to the new schema,
the conclusions appropriate to the various Solutions I-III, concerning
intertranslatability in particular, differ more significantly amongst themselves.
Consequently, an examination of this schema especially will contribute towards a
comparative appraisal of these solutions.


P. M. WILLIAMS 
The University of Sussex


1 Introduction


The role in scientific theories of theoretical terms, which do not correspond 
directly with observables, continues to be a matter of discussion and some 
puzzlement. If a theoretical term in a particular theory is definable, it can, of 
course, be eliminated directly without altering the semantic content of the theory. 
In order to make room for theoretical terms that are not definable, Ramsey 
([1931], Chapter IX) introduced the notion of eliminability. Roughly speaking, 
a theoretical term is eliminable, in the sense of Ramsey, if all the empirical 
claims of the theory can be made without invoking it. Ramsey apparently 
believed that in a properly formulated scientific theory all theoretical terms 
would be eliminable, but that not all need be definable. Sneed [1971], however, 
has produced examples of simple theory-like systems containing theoretical 
terms that are not eliminable in the Ramsey sense. 

Closely related to this issue is the question of what role a theoretical term can 
play in a scientific theory if it is neither definable nor Ramsey-eliminable. To 
admit into scientific theories terms that do not correspond to observables yet 
which are required in order to make the full set of empirical claims of the theory 
would seem to run counter to the requirements of operationalism. Of course, 
if there were no such terms, there would be no problem, but Sneed’s examples 
would seem to suggest that there are. 


* This work was supported in part by Public Health Service Grant MH-07722 from the 
National Institute of Mental Health. Also by Grant OEG 3-71-01212 from the Com- 
mittee on Basic Research in Education and the Office of Education. We are deeply 
indebted to Professor Raimo Tuomela and Mr Ilkka Niiniluoto for valuable comments on 
an earlier draft of this paper. In particular, Mr Niiniluoto pointed out a serious defect 
in an earlier form of the proof of Theorem 4, which we have now corrected. We are also 
indebted to Dr P. M. Williams for useful advice, some of which we have followed.

The main aim of this paper is to show that the theoretical terms appearing in
‘well-formulated’ scientific theories will always be at least Ramsey-eliminable 
even when they are not definable. We will introduce some well-motivated 
restrictions on the class of systems that are to be regarded as well formulated, 
thereby defining a class of FIT (finitely and irrevocably testable) theories. 
We will show that the theories of Sneed’s examples are not FIT, and that 
theoretical terms are always Ramsey-eliminable from FIT theories. Our analysis 
will also show that the FITness conditions provide an explanation for an
often-observed asymmetry between existentially and universally quantified claims of a
theory. 


2 The Formalisation 


We will proceed as though the scientific theories we are interested in were 
axiomatised in the first-order predicate calculus with equality <L, =>. Of 
course, no complete axiomatisations of significant scientific theories have 
actually been constructed in this formalisation, but Sneed has shown that the 
interesting methodological issues can be raised and discussed in this context. 

We will be concerned with a theory that contains one or more theoretical 
function symbols or predicate symbols; and with structures of observable 
functions or predicates that may be expandable to models of the theory. Testing 
a theory, then, will consist in determining whether the structure describing an 
actual set of observations is or is not expandable to such a model—whether the 
theory ‘holds’ for these observations. A formalisation of these ideas may be 
sketched as follows:¹

We employ a first-order language, L, with equality, containing certain
function symbols, predicate symbols and constants (0-ary function symbols).
One or more of the function and predicate symbols, $O_1, \ldots, O_p$, is drawn from
a set O; and one or more, $T_1, \ldots, T_p$, from a set T. There is an infinite set, X,
of constant symbols, $x_i$. We wish to consider a theory, F, in L whose axioms are
the logical axioms together with the formula F(O, T), which contains symbols
from O and T.

Next, we introduce a structure, $\langle D, O_M, T_M \rangle$, consisting of a set, D, of
elements, and two sets of functions from D to D and predicates in D (call them
$O_M$ and $T_M$) corresponding to the two sets of symbols, O and T, in L. To each
distinct element of D we assign a distinct constant symbol from X. (When the
ambiguity will not cause confusion we will sometimes write O indifferently for
$O_M$ and O, and T for $T_M$ and T. It will usually be clear from the context whether
we are speaking of a theory or a structure.)

Definition. The observable consequences, H(O), of a theory are the members of 
the class of all consequences of F that do not contain symbols from T. 

Let a closed instance of F(O, T) be a closed formula obtained from F(O, T) by
replacing each free variable in F(O, T) by one of the constants that corresponds
to an element of D. If all the closed instances constructed in this way are true of
$\langle D, O_M, T_M \rangle$, then this structure is a model for the theory F.


1 In general, we follow the terminology of Shoenfield ([1967]), Chapters a) to which the 
reader can refer for detail. 


We might be tempted at this point to propose that a structure $\langle D, O_M, T_M \rangle$
disconfirms a theory, F, if the structure is not a model of the theory. But matters
are a little more complicated. In distinguishing the sets $O_M$ and $T_M$, our intent
is to interpret the elements, x, of D as observed objects; the functions of $O_M$ as
observable functions whose values for any x can be determined by observation of x;
and the n-ary predicates of $O_M$ as observable relations whose truth-values for any
set of n observations can be determined from those observations. The members of
$T_M$, on the other hand, are theoretical functions and relations, whose values cannot
be determined directly from observations. Following Sneed ([1971], p. 52), we
deal with the problem of non-observability by considering the structure $\langle D, O_M \rangle$,
obtained by removing the functions and predicates $T_M$ from $\langle D, O_M, T_M \rangle$.
(The former structure is called by Shoenfield a restriction of the latter.) We next
introduce the


Definition. A given structure, $\langle D, O_M \rangle$, is expandable to a model for the theory F
iff there exists a $T_M$ in D, and a corresponding T in L, such that F(O, T) is valid
in $\langle D, O_M, T_M \rangle$.


Since the $T_M$’s are assumed not to be directly observable, we are free to
expand $\langle D, O_M \rangle$ by choosing any T that makes F(O, T) valid, provided one
exists. But now, whether a particular $\langle D, O_M \rangle$ is expandable to a model for F
depends only on the $O_M$, hence is an observable property of the structure.
We can say, therefore, that a structure (i.e. a set of observations) disconfirms a 
theory if it is not expandable to a model for the theory. It is with this notion of 
disconfirmation that we shall be concerned here. 
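
For small finite structures this notion of disconfirmation can be checked by exhaustive search. The following Python sketch is an illustration only: the toy theory `toy_theory` (one binary observable O and one unary theoretical predicate T) and all helper names are assumptions introduced for the example, not part of the formalisation above.

```python
from itertools import chain, combinations

def subsets(domain):
    """Yield every subset of the domain (every candidate interpretation of a unary T)."""
    items = list(domain)
    return chain.from_iterable(combinations(items, r) for r in range(len(items) + 1))

def toy_theory(domain, O, T):
    """Hypothetical F(O, T): for all x, y, O(x, y) implies T(x) and not T(y)."""
    return all((x in T) and (y not in T) for (x, y) in O)

def is_expandable(domain, O, theory):
    """True iff some interpretation of T makes the theory hold in <D, O, T>."""
    return any(theory(domain, O, set(T)) for T in subsets(domain))

def disconfirms(domain, O, theory):
    """A set of observations disconfirms the theory iff it is not expandable."""
    return not is_expandable(domain, O, theory)
```

Under these assumptions, the observations O = {(1, 2), (3, 2)} on D = {1, 2, 3} are expandable (take T = {1, 3}), whereas O = {(1, 2), (2, 3)} disconfirms the toy theory, since element 2 would have to both satisfy and fail T.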

Suppose that all of the symbols of T in a theory, F, are definable in terms of
the symbols of O in F(O, T). In this case we can simply use the definitions of
the symbols of T to eliminate all of them from F(O, T), obtaining the formula
F′(O), whose observable consequences, H′(O), are identical with the observable
consequences, H(O), of F(O, T). All models $\langle D, O_M \rangle$ for H(O) (or H′(O)) will
be expandable, using the given definitions of the symbols of T, and the
corresponding $T_M$, to models for F(O, T). In this case, whether we express the
theory in terms of both the T’s and O’s, as F(O, T), or in terms of the O’s alone,
as F′(O), is a matter of convenience, not necessity. The presence of the
unobservable $T_M$’s raises no issues about the operationality of the theory. We
are particularly interested, therefore, in the case of theories that contain T’s that
are not definable in terms of the O’s.


3 Ramsey Eliminability 


In order to introduce the concept of Ramsey-eliminability, we need one additional
bit of notation. Let M be a set of structures, $\langle D, O, T \rangle$. Then we will designate
by $M_O$ the set of structures $\langle D, O \rangle$ obtained from M by deleting the T functions
and relations. Thus, to each member of $M_O$ there will correspond a subset
of M.


1 There are, of course, well-known difficulties in identifying relations and functions that
can actually be regarded as ‘observables’ in any strict sense of the word. Nevertheless,
in formulating theories it is both customary and convenient to dichotomize the relations
in this way, and thus to distinguish between the theory proper, on the one hand, and the
‘auxiliary hypotheses,’ on the other, that are required to connect the ‘observables’ with
actual physical measurement operations. We will simply follow custom in dividing the
difficulties in this way.

In M consider the set of all models for F(O, T). Call this set $M^T$. Restrict
each member of $M^T$ by removing the T relations, obtaining a set, $M_O^T$, of models
in $M_O$. Clearly, all members of $M_O^T$ are expandable to models for F(O, T), for
we can construct these expansions by reintroducing the T’s that were deleted.

Next, consider the set of all models in M for H(O). Call this set $M^O$. Again,
remove the T relations to obtain a set, $M_O^O$, of models in $M_O$. For the same reasons
as before, all members of $M_O^O$ are expandable to models for H(O). Moreover,
by the method of their construction, the members of $M_O^T$ are all models for
H(O) in $M_O$. Since $F(O, T) \vdash H(O)$, we have proved the


Theorem 1. $M_O^T \subseteq M_O^O$.


The converse of Theorem 1 need not hold; there may be members of $M_O^O$
that are not members of $M_O^T$, as we shall see presently from examples.

However, for finite D, the converse of Theorem 1 follows directly from a
well-known result of Craig & Vaught ([1958]), so we may assert (writing $m_k$
for a model of exactly k elements):


Theorem 2. For any finite D, all members of $M_O^O$ are expandable to members
of $M^T$. Hence we can write $\forall k\,(m_k \in M_O^T \equiv m_k \in M_O^O)$.


The conclusion follows from the fact that, for finite k, we can describe any 
model extensionally in terms of the sets of elements of D for which each of the 
members of O holds, and any set of models by a disjunction of such descriptions. 
Hence, in particular, we can distinguish the class $M_O^T$ in terms of such a
disjunction, involving only the O’s.

We can now introduce a formal definition of eliminability. 


Definition. If, for fixed D, $M_O^T = M_O^O$, that is, if all models in $M_O$ for H(O)
are expandable to models for F(O, T), then we will say that the theoretical
relations, T, are D-Ramsey-eliminable. If the relations, T, are D-Ramsey-eliminable
for all D, then we say, simply, that they are Ramsey-eliminable.


Ramsey introduced, but did not define very exactly, the notion of elimina- 
bility. The definitions given above essentially follow Sneed (pp. 52-3). They 
appear to be faithful to Ramsey’s original intent in introducing the concept. 

If the theoretical terms of a theory are Ramsey-eliminable, then in a certain 
sense all of the empirical content of the theory can be expressed in terms of the 
observable relations, O. In this case, even if the theoretical terms are not defin- 
able (Shoenfield, pp. 80-1), they are present in the theory as a matter of con- 
venience, not of necessity. Hence, if theoretical terms are always Ramsey- 
eliminable from theories, their introduction for reasons of convenience raises no 
questions of operationality. 

Sneed, however, has proposed two examples of theories containing non- 
eliminable theoretical terms (Sneed, pp. 54-5). But close examination of his 
examples raises the question of whether they possess all the properties that we 
usually associate with scientific theories. It is our contention that they do not, 
and that if we place appropriate restrictions on the scope of the term ‘theory,’ 
it will be found that theoretical terms are always Ramsey-eliminable from
theories.


4 Finite and Irrevocable Testability


The conditions we shall introduce are aimed at guaranteeing that anything we
call a well-formed theory will be operational or testable. We shall be mainly
concerned with two such conditions: finiteness and irrevocability; and we will
use these conditions to define a class of finitely and irrevocably testable (FIT) theories.

The FITness conditions for a theory have nothing to do, directly, with 
whether the theory does or does not contain theoretical terms—they are not 
introduced ad hoc in order to deal with the problem of non-eliminability of 
theoretical terms. On the contrary, we shall see that it is quite natural to impose 
the requirements of FITness even on theories that are stated entirely in terms
of observables—that is, theories of the form F(O)—and that theories that fail 
to meet these requirements also fail to satisfy intuitive notions of operationality 
that have been employed by Popper ([1959], Chapter III) and others.

As before, we are considering structures, $\langle D, O_M \rangle$, that represent sets of
observations aimed at testing a scientific theory F(O, T). A set of (possible)
observations, then, comprises a set of observed objects, $x \in D$, and a set of
functions and relations, $O_M$, on them. The set of observations can be identified
with a model, m, that is possibly expandable to a model of the theory, F(O, T),
by introducing a new set of functions and relations, $T_M$, and the structure
$\langle D, O_M, T_M \rangle$. When we wish to indicate that a model, m, has exactly k elements,
we will write $m_k$.

Definition. Consider a model, $m_k$, and a second model, $m_{k+}$, obtained by
annexing (zero or more) additional elements to D and extending the $O_M$’s to
encompass these new elements and their relations with the original elements.
We will call the new model, $m_{k+}$, an extension of $m_k$, and will write $m_{k+} \geq m_k$
or $m_k \leq m_{k+}$. If $m_{k+} \geq m_k$ and $m_{k+} \neq m_k$, we write $m_{k+} > m_k$.


Extending a model is to be interpreted as taking additional observations. 
Two questions now arise with respect to the testability of a theory: 


(1) If the theory is false, do there always exist finite sets of observations on the
basis of which it could definitely be disconfirmed (i.e. for every structure, D, not
expandable to a model for F, do there exist substructures $D_k$, with finite k, that
are not in $M_O^T$, hence not expandable to $M^T$)?

(2) Can the disconfirmation of a theory by some set of observations, $m_k$, ever
be revoked by taking additional observations, so that $m_k$ is not in $M_O^T$, while the
extended set, $m$, $m > m_k$, is in $M_O^T$?

If the first question can be answered in the affirmative for a theory, we say 
that the theory is finitely testable; if the second question can be answered in the 
negative, we say that the theory is irrevocably testable. More formally: 

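Definition. A theory, F(T, O), is finitely testable iff $\exists m\,(m \notin M_O^T)$ and

$\forall m\,[(m \notin M_O^T) \rightarrow \exists m_k\,((m_k \leq m) \wedge (m_k \notin M_O^T))]$, for finite k. (1)

Definition. A theory, F(T, O), is irrevocably testable iff

$\forall m\,[\exists m_k\,((m_k \leq m) \wedge (m_k \notin M_O^T)) \rightarrow (m \notin M_O^T)]$. (2)

Definition. A theory, F(T, O), is FIT iff it is both finitely and irrevocably
testable:

$\forall m\,[\exists m_k\,((m_k \leq m) \wedge (m_k \notin M_O^T)) \leftrightarrow (m \notin M_O^T)]$ and $\exists m\,(m \notin M_O^T)$. (3)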


372 H. A. Simon and G. J. Groen


How are we to interpret the concepts of finite and irrevocable testability, and 
what justification do we have for equating these concepts with requirements of
operationality? We may view the situation as follows: The scientist gains his
knowledge of how the world is by making a sequence of observations on
phenomena—that is, observations of objects, of functions of objects, and of 
the satisfaction or non-satisfaction of certain relations among sets of objects. 
Up to any given time, the total set of observations that has been made is finite, 
but it may be enlarged indefinitely by taking new observations. Any denumerable 
sequence of observations whose initial segment agrees with the observations 
taken up to the present time describes a possible world. As new observations 
are made, they eliminate certain of these possible worlds. 

The scientist wishes to formulate hypotheses, or theories that will hold for the 
actual world—not only as it has been revealed by past observations, but as it 
will appear in the light of future observations as well. Hume taught us that there 
is no way to guarantee that a theory consistent with all observations to date will 
not be refuted by future observations. But what of the converse: is there some 
way in which we can guarantee that if a theory is false (i.e. the sequence of past
and future observations is not a model for it), this will become known to us
sooner or later by observation? If theories cannot be confirmed, can they, at 
least, be disconfirmed when false? Finite and irrevocable testability are conditions 
that, if imposed upon the class of theories we are willing to entertain, guarantee 
their disconfirmability in a very natural sense. 

Consider, first, what it would mean for a theory not to be finitely testable. 
Then, even if some assertions of the theory were false in the actual world, it 
could happen (i.e. the actual world could be such) that all of the assertions 
would be true for every finite set of observations that could be taken in that 
world. Thus, there would be no way to distinguish, by taking observations, 
between that actual world and a world in which the theory was true. The 
finite testability condition outlaws theories that are untestable in this sense. 
As a matter of fact, it appears to be difficult to construct an example of a theory— 
real or imaginary—that has any sort of surface plausibility, yet is not finitely 
testable. We will offer an example or two later of theories that are not finitely 
testable, but it will be seen that they are forced and ‘artificial’. 

Consider, next, what it would mean for a theory not to be irrevocably testable. 
Then, even if the theory were incompatible with the observations made to date, 
it might be ‘saved’ simply by making additional observations. Thus, we could 
never, by means of observations, refute a theory once and for all; its refutation 
would always be subject to reversal. It is as if, having taken ten observations of 
pairs of numbers that did not all lie on a straight line, we could now observe 
five more pairs and find that all fifteen did indeed lie on a straight line.¹
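
The straight-line illustration can be made concrete with a short Python sketch. It is an illustration only: points are taken to be pairs of integers so that the arithmetic is exact, the function names are assumptions introduced here, and the cross-product test also accepts vertical lines, which a theory of the form y = a + bx would exclude.

```python
from itertools import combinations

def collinear(p, q, r):
    """True iff three points lie on one straight line (zero cross product)."""
    return (q[0] - p[0]) * (r[1] - p[1]) == (q[1] - p[1]) * (r[0] - p[0])

def lie_on_a_line(points):
    """True iff every triple of the observed points is collinear."""
    return all(collinear(p, q, r) for p, q, r in combinations(points, 3))

# Ten observations, the last of which is off the line y = 2x + 1.
ten = [(i, 2 * i + 1) for i in range(9)] + [(9, 0)]
# Five further observations that all lie on that line.
five = [(i, 2 * i + 1) for i in range(10, 15)]
```

Since `lie_on_a_line(ten)` is false, `lie_on_a_line(ten + five)` is false as well: any triple that fails in the smaller set is still present in the larger one, which is the sense in which such a refutation cannot be revoked by further observation.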


1 It should be emphasised that in introducing the FITness requirements we are setting
testability conditions for theories, and not prescribing what is to be done with a theory
if it is falsified. In particular, to require that a theory be falsifiable, which is the intent of
the FITness conditions, does not mean that it must be rejected forthwith if falsified.
‘Irrevocability’, then, does not imply that under no circumstances will we consider
resuscitating a falsified theory. Rather, it implies that a falsified theory cannot be
resuscitated merely by taking additional observations and without other alterations.
Thus, the FITness requirements are compatible with what Lakatos calls ‘methodological
falsificationism’, which permits a falsified theory to be saved, for example, by modifying


It is interesting to note that requirements of testability for predicates that are
quite similar in spirit to the FITness conditions have been introduced into
automata theory and the theory of perceptrons. The notion of ‘locally testable
event’ in automata theory (McNaughton and Papert [1971]) closely approximates
our concept of ‘FIT theory’, while a corresponding—but slightly stronger— 
concept of ‘conjunctively local predicate’ plays an important role in the theory 
of perceptrons (Minsky and Papert [1969]). In both cases, the motivation for the 
restriction is the same as ours: to insure a certain measure of (one-sided) decida- 
bility when a computational faculty that is capable of handling only finite sets of 
symbols is faced with a potentially infinite sequence. 

To these justifications for the FITness conditions, we can add some less 
formal arguments buttressed by plausible examples. First, equating testability 
with disconfirmability or falsifiability follows the general usage, for which 
Popper ([1959], Chapter IV) has made such a convincing case. Second, the 
FITness conditions provide an explanation for the asymmetry, upon which 
Popper ([1959], especially pp. 68-72) and others have commented, between the
testability of universal and existential statements.1 

Finally, when applied to simple examples, the FITness conditions draw an
intuitively plausible boundary between well-formed and ill-formed theories, 
as can be seen from the following illustrations. Note that none of the theories of 
the examples contain theoretical terms; all of the predicates are assumed to be 
directly observable. 


T1: Unicorns exist. ($\exists x\, U(x)$).
T2: No unicorns exist. ($\forall x\, \neg U(x)$).
T3: There is a finite number of sunrises.
T4: There is a finite number of primes.
T5: The number of stars is prime.


Theory T1 is not irrevocably testable. A system of k observations containing
no observations of unicorns cannot be expanded to a model for the theory; but
extension of the observations to include a unicorn observation produces a new
system that is a model for the theory. Hence disconfirmation can be revoked
by extension.

Theory T2 is FIT. If unicorns exist, then observation of any single unicorn
disconfirms T2, and no additional observations of non-unicorns can restore its
validity. But ‘unicorns exist’ must be interpreted as meaning that there is at
least one unicorn observation in the (potentially infinite) sequence of past and 
future observations. 
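
The contrast between T1 and T2 can be sketched in Python. This is an illustration under simplifying assumptions: a sequence of observations is modelled as a list of booleans (True for a unicorn observation), and the function names are introduced here for the example. Because neither theory contains theoretical terms, a finite set of observations disconfirms a theory exactly when the theory is false of it.

```python
def t1_holds(observations):
    """T1: unicorns exist (some observation is a unicorn)."""
    return any(observations)

def t2_holds(observations):
    """T2: no unicorns exist (no observation is a unicorn)."""
    return not any(observations)

def revocable_along(holds, observations):
    """True iff some initial segment disconfirms the theory although a longer
    initial segment of the same sequence satisfies it again."""
    refuted = False
    for n in range(1, len(observations) + 1):
        ok = holds(observations[:n])
        if refuted and ok:
            return True
        refuted = refuted or not ok
    return False

sequence = [False, False, True, False]  # the third observation is a unicorn
```

Along this sequence the disconfirmation of T1 by the first two observations is revoked by the third, whereas T2, once disconfirmed by the third observation, stays disconfirmed however the sequence continues.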

Notice that, provided at least one observation is made, T2 implies ‘There
exists a non-unicorn.’ The latter assertion, though a consequence of the former,
is non-testable, because it is not irrevocable. The apparent paradox disappears
when we recall that testability means disconfirmability. No sequence of unicorn
observations can disconfirm the proposition. Hence, a weak consequence of a
strong and FIT theory may itself not be FIT.


the auxiliary hypotheses that connect it with observables. (See the discussions below of
neutrinos and of celestial mechanics, and the much fuller treatment of these issues in
Lakatos [1970].)

1 See also Popper, p. 193 and footnote no. 2 for a comment on what he means by this
asymmetry. We shall have more to say about it in a later section of this paper.

Theory T3 is not finitely testable. If there is not a finite number of sunrises,
this cannot be shown by a finite set of observations. This is not to say that an 
assertion like T3 may not be derivable as a theorem from a larger theory, but 
simply that the truth of T3 cannot be disconfirmed by making observations of 
dawns to see if they are sunrises. 

Theory T4 is disprovable, from the definition of the primes, but it is not 
finitely testable for the same reason that T3 is not finitely testable. This should 
not disturb us when we recognise that ‘testable’ is intended to mean ‘empirically 
testable’. The proposition that there is a finite number of primes is certainly not 
empirically testable. 

Theory T5 is not irrevocably testable, as can be shown by extending a set of 
observations of k stars ($k \notin$ Primes) until the size of the set equals the next
larger prime number. 
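
The revocation of T5's disconfirmation can be computed directly. The following Python sketch is illustrative only; the function names are assumptions made for the example.

```python
def is_prime(n):
    """Trial-division primality test, adequate for small n."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

def next_prime_at_least(n):
    """Smallest prime p with p >= n."""
    while not is_prime(n):
        n += 1
    return n

def stars_needed_to_revoke(k):
    """Number of further star observations after which a set of k stars satisfies T5."""
    return next_prime_at_least(k) - k
```

For example, a set of 8 observed stars disconfirms T5, but observing 3 more stars yields a set of 11, which satisfies it again.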

It is certainly not unreasonable to exclude T3, T4, and T5 from the class of 
well-formed theories. Moreover, the difference between T1 and T2 is Popper’s
asymmetry, based on the existential quantifier in T1 and the universal quantifier
in T2. The criterion of irrevocability rules out T1 but accepts T2. Hence, if we
find Popper’s condition to be intuitively plausible, we may take it as grounds for
requiring irrevocability of a theory, and rejecting T1. Conversely, if we find
the irrevocability condition to be plausible, we may take it as grounds for accept- 
ing Popper’s condition. 

Let us consider a final example of FITness that is slightly more interesting 
than T2: 

T6: D is a set of objects that can be weighed in pairs on a two-pan balance. 
O(x, y) means that when x and y are placed on the balance, the pan on which x 
rests descends, and the other rises. We now hypothesise that O is a transitive 
relation, i.e. that $\forall x\, \forall y\, \forall z\,[O(x, y) \wedge O(y, z) \rightarrow O(x, z)]$.

T6 is finitely testable, for if the theory is false, there exists a triplet (x, y, z),
which violates the transitivity. But if the triplet is taken as the set D, we then 
have a finite model that cannot be expanded to a model for the theory. T6 is 
irrevocably testable, for if a triplet violates the transitivity, any extended model 
containing that triplet will also violate it. 
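
Both properties of T6 can be checked mechanically on small examples. The Python sketch below is an illustration only: the relation O is represented as a set of ordered pairs, and the function names are assumptions introduced here.

```python
from itertools import product

def violating_triple(domain, O):
    """Return a triple (x, y, z) with O(x, y) and O(y, z) but not O(x, z), or None."""
    for x, y, z in product(domain, repeat=3):
        if (x, y) in O and (y, z) in O and (x, z) not in O:
            return (x, y, z)
    return None

def disconfirms_t6(domain, O):
    """A set of weighings disconfirms T6 iff it contains a violation of transitivity."""
    return violating_triple(domain, O) is not None

def restrict(O, subdomain):
    """Restriction of the relation O to a subset of the domain."""
    return {(x, y) for (x, y) in O if x in subdomain and y in subdomain}

D = ["a", "b", "c", "d"]
O = {("a", "b"), ("b", "c")}  # O(a, c) is missing, so transitivity fails
```

Here `violating_triple(D, O)` finds the triplet (a, b, c); the restriction of O to that triplet already disconfirms T6 (finite testability), and adding further objects and weighings to D and O cannot remove the violation (irrevocability).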

Finally, it is worth remarking again that none of the examples, T1-T6,
contain theoretical terms. Hence, the plausibility of imposing FITness conditions 
on the admissibility of scientific theories does not rest on any peculiarities of 
theories that possess such terms. 

In place of FITness, we could introduce a somewhat stronger condition of 
uniform FITness (UFIT). Suppose that the test for falsifying a theory involves
the examination of relations among exactly k individuals. In the case of T6, 
above, k = 3. For that theory, if a particular structure of observations is not 
expandable to a model for the theory, then there always exists a substructure of 
three observations that is not so expandable; and conversely, if every triplet of 
observations in a structure is expandable to a model of the theory, then so is the 
entire set. 


- Definition. A theory: F(T, O), is &-testable iff it is FIT and oe exists'a k 
such that (3) is satisfied with the m, of size k. 
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The constant, k, is related to the concept of “degrees of freedom”. That is, 
a k-testable theory may be viewed as having (k—1) degrees of freedom. Suppose 
we have a theory that asserts that a set of points, representing pairs of observables, 
lies, on a straight line: 


This theory is 3-testable, since a set of observations disconfirms it iff some 
subset of three observations does not lie on a straight line. But the theory has 
two degrees of freedom, corresponding to the T-constants, a and b. An axioma- 
tisation of Ohm’s Law that is also 3-testable will be discussed in a letter section. 

Definition. A theory, F(O, T), is uniformly FIT (UFIT) iff it is k-testable for 
some Å. 


5 Testability and Eliminability 

Returning to the general case of FIT theories, we now prove the main theorem. 
We will use (1) and (2) as premises, instead of (3), to show that both the finiteness 
and irrevocability conditions play an essential role in the proof. 


Theorem 3. If a theory, F(O, T), is FIT (i.e., if (1) and (2) are satisfied), then
its theoretical terms, T, are Ramsey-eliminable; i.e.


∀m[(m ∉ M*_O) → (m ∉ M_O)] (5)

Proof: Consider any particular m, m ∉ M*_O. If the domain of m is finite, then
(5) holds, hence we need consider only m with infinite domain. Then, by the
finite testability condition (1):

∃m_k[(m_k ⊂ m) ∧ (m_k ∉ M*_O)], k finite. (6)

Designate by S_k(O) the complete description of some m_k satisfying (6), that is,
the conjunction of formulas that state for each predicate in O for which sets of
elements of D_k this predicate holds, and for which it does not; and for each
n-argument function of O, its value for each set of n elements in D_k. Since k is
finite, S_k(O) is of finite length; and by its construction, S_k(O) does not involve T.
Now consider any model, m′, satisfying F(O, T), so that m′ ∈ M*_O. Construct
S*_k(O) by replacing the distinct individual symbols in S_k(O) that designate
different individuals in D_k with distinct variables. Suppose that the following
sentence, not involving T, were to hold for m′:


∃x_1, ..., x_k (S*_k(O)); x_1, ..., x_k in the D of m′. (7)

Then any set of elements x_1, ..., x_k satisfying (7) determines a model, m′_k,
described by S*_k(O), that is isomorphic with the m_k of equation (6). Hence, from
the second conjunct of (6):


m′_k ∉ M*_O (8)

By the irrevocability condition, (2), m′_k ∉ M*_O → m′ ∉ M*_O, so that, from (8):

m′ ∉ M*_O, (9)

contradicting our assumption that m′ ∈ M*_O. Hence the latter assumption, or its
equivalent (that m′ satisfies F(O, T)), is incompatible with (7), so that


F(O, T) → ∀x_1, ..., x_k ¬S*_k(O) (10)

But the right-hand side of (10) does not involve T, hence belongs to H(O).


o₂(x) = a + b·o₁(x) (4)

By definition, m ∈ M_O means that H(O) is true of m. Since (7) holds for m, but
contradicts the right-hand side of (10), it follows immediately that m ∉ M_O.
Q.E.D.

Since Sneed has provided two examples of theories from which the theoretical 
terms are not Ramsey-eliminable, it will be instructive to discover in what 
respects these theories are not FIT. His first example is this (Sneed, p. 54): 


T7: F(T, O) = (i) ∀x∀y(T(x, y) → ∃zO(x, z) ∧ ¬∃zO(y, z)) ∧
(ii) ∀x(∃zO(x, z) → ∃y[T(x, y) ∧ ∀w(T(x, w) → w = y)]) ∧
(iii) ∀x(¬∃zO(x, z) → ∃y[T(y, x) ∧ ∀w(T(w, x) → w = y)])


Of this he says: ‘Sentences (i), (ii), and (iii) are true exactly in models <D, T, O>
in which there is a one-one correspondence between individuals which stand in
the first place in the O-relation and those which do not. Yet it can be shown that
there is no sentence, containing only the predicate O and identity which is true
exactly in models <D, O> in which there is such a one-one correspondence.’

Notice of T7 that, given any system of observations <D, O> with O(x, y) such
that one-half of the elements stand in an O-relation to some y, and half do not, 
and none of the y’s stand in an O-relation to some z, then this system can be 
extended to a model for F(T, O). It is easy to see that the theory is finitely 
testable. However, it is not irrevocably testable. For suppose we observed an 
odd number of elements such that the conditions above could be satisfied by 
adding one more, appropriate, element. Then, since the conditions can be 
satisfied only by an even-numbered set of observations, the initially observed set
of elements disconfirms the theory, while the extended set does not, contrary to 
condition (2) of the previous sections. 
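
The failure of irrevocability can be made concrete with a minimal sketch. It relies on Sneed's characterisation quoted above: a finite system <D, O> can be expanded to a model of T7 just in case the individuals standing in the first place of the O-relation can be put in one-one correspondence with those that do not, which for finite D means equal counts. The function name `expandable` and the sample data are hypothetical.

```python
def expandable(domain, o_pairs):
    """True iff a finite <D, O> admits some T satisfying (i)-(iii).

    For finite D this reduces to: the elements occurring in the first
    place of O are exactly as numerous as the elements that do not.
    Assumes every element mentioned in o_pairs belongs to domain.
    """
    firsts = {x for x, _ in o_pairs}
    others = set(domain) - firsts
    return len(firsts) == len(others)

# Hypothetical observations: a and b stand in the O-relation to c.
o_pairs = {("a", "c"), ("b", "c")}
observed = {"a", "b", "c"}      # odd number of elements
extended = observed | {"d"}     # one more element, unrelated by O
```

Here `expandable(observed, o_pairs)` is false while `expandable(extended, o_pairs)` is true: the disconfirmation produced by the smaller set is revoked by the larger one, which is exactly what condition (2) forbids.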

Of this theory, Sneed says (pp. 54-5): ‘One might raise a question as to whether 
the fact that there is a one-one correspondence between individuals which stand 
in the first place of the O-relation and those which do not, is an observable fact. 
If one takes “observable fact” to mean “fact expressible in the observation 
vocabulary”, then it clearly is not. This is just the force of the example. On the
other hand, it is obviously a fact about the O-relation. It is just the same sort of 
fact about the O-relation as that O is transitive. If one, in some sense, discovers 
by observation whether or not individuals stand in the O-relation, then there is 
no difference in the way one would check to see whether some particular O- 
relation had either of these properties. The point is that there are some facts 
about O-relations that cannot be expressed by sentences containing only the 
O-predicate.’ 

These statements are correct if we take ‘observable’ to mean ‘finitely testable’, 
but not if we take it to mean ‘finitely and irrevocably testable’. The theory that
the O-relation is one-one is finitely, but not irrevocably, testable; while the 
theory that it is transitive is both finitely and irrevocably testable. Hence the 
latter theory is FIT, while the former is not. 

Sneed’s second example is contained in his observation (p. 55) that in ‘an 
axiomatization of the theory of ordered fields for the first-order predicate 
calculus with identity... for any sentence containing only the order-relation 
predicate and identity, there will always be models which cannot be... 
[expanded] . . . to produce models for the full theory. That is to say, intuitively, 
the notion of an ordered field cannot be fully characterized by sentences in the
first-order calculus with identity, containing (besides identity) only the order-relation
predicate.’

The difficulty here is the same as was encountered in the first example— 
the condition of irrevocability fails. It is easy to construct a system that is not an 
ordered field, and cannot be expanded to one, but which can be extended by 
addition of appropriate elements into a system that can be so expanded. Take as 
an example of the former system a completely ordered discrete set with a least 
element (e.g. the positive integers). 

The most appropriate way, therefore, to deal with Sneed’s examples of 
theories from which theoretical terms are not Ramsey-eliminable, is to deny that 
they are admissible theories by imposing the requirement that theories be FIT. 
If this requirement is imposed, then the theoretical terms are always Ramsey- 
eliminable. 


6 Theories Making Existential Claims 

We have not stated precisely what we mean by the distinction between theories 
making universal claims and theories making existential claims. Dr P. M. Williams 
has made an observation (in a personal communication with the authors) that 
permits the distinction to be drawn in simple terms. He points out that the class 
of structures, M*_O, of Theorem 3 will satisfy conditions (1) and (2) iff H(O) is
equivalent to a set of universal prenex sentences. If we now say that a theory
makes existential claims if the set of all its observable consequences is not so
equivalent, then it follows immediately that a theory is FIT iff it does not make 
existential claims. 

Our interest, however, lies not in this formal relation between FITness and
the universal-existential dichotomy, but in the implications for the practice of
science of excluding theories that make existential claims in this sense. If one
can point to examples of such theories that are in good repute in science, then
serious doubt will be cast on the appropriateness of the FITness conditions.

The most striking examples of actual theories that appear to make existential 
claims are the theories in physics that assert that certain kinds of elementary 
particles exist. The assertion that neutrinos exist is an important case, and one 
we shall discuss specifically. More recent examples, involving exactly the same 
issues, are theories asserting the existence of particles of anti-matter of various 
kinds. 

No issue would arise if the statement ‘Neutrinos exist’ were simply derived as 
a consequence, employing theoretical terms, of some theory. The problem is that 
‘neutrino’ is used by physicists as an observable term, not just a theoretical one. 
That is to say, if certain events are observed in a cloud chamber or bubble 
chamber experiment, these events are interpreted as equivalent to the observation 
of a particle of a certain kind. In experiments performed to test whether 
neutrinos exist, an affirmative answer would require the observation of neutrino 
events. The question before us is whether the failure to observe neutrino events 
would disconfirm the assertion that neutrinos exist. If not, it is not clear why 
the experiment should be performed at all. Under these circumstances, if 
‘Neutrinos exist’ is treated as a complete theory, it is not FIT; if it is not a 
complete theory, but is a consequence of other propositions, then the experiment
can still not disconfirm it, hence cannot disconfirm the propositions from which
it is derived. If an assertion belonging to H(O) is, when regarded as an indepen- 
dent theory, not FIT, it may be harmless, but it cannot contribute anything to 
the testability of the theory. 

Let us now turn to the celebrated experiment that was performed to test the 
existence of neutrinos. The critical question to raise about this experiment is 
what conclusion would have been drawn if no neutrinos had been detected 
(they were!). The experiment was carefully designed so that the density of
neutrinos (if they existed with the properties that had been postulated of them) 
would be sufficiently high so that they would not remain undetected if present. 
If they had not been detected in the original experiment, but were subsequently 
detected by augmented observations, the disconfirmation would not have been 
revoked. On the contrary, a new theoretical problem would have been created— 
to explain why they were not observed in the original set of observations. Thus, 
the theory being tested was a stronger theory than ‘neutrinos exist.’ It was more 
nearly ‘neutrinos exist in such numbers and with such properties that in any set 
of k observations under these conditions there will be some neutrino observa- 
tions.’ The theory reformulated in this way is FIT. It also makes a universal, not 
an existential, assertion. 

The experiment could also be explained by arguing that what was being 
tested was not ‘neutrinos exist’, but ‘no neutrinos exist’. As we have already seen 
from the case of the unicorns, the latter theory is FIT. The observation of 
neutrinos disconfirmed irrevocably the theory that ‘no neutrinos exist’. However, 
this interpretation of the experiment appears less satisfactory than the one 
proposed in the previous paragraph, for it does not explain why this experiment 
was regarded as critical, while earlier experiments, where the (predicted) 
probability of detection was lower, were not so regarded. 


7 Axioms for Physical Systems


In a previous paper, one of us (Simon [1970]) used Ohm’s Law as an example 
to show how a physical theory could be axiomatised by defining an appropriate 
set-theoretical predicate, Ohmic circuit. The definition was given by: 


T8: Γ is a system of Ohmic observations iff there exist D, r, c, such that:
(1) Γ = <D, r, c>;
(2) D is a finite, nonempty set;
(3) r and c are functions from D into the real numbers;
(4) for all x ∈ D, r(x) > 0 and c(x) > 0.
Γ′ is an Ohmic circuit iff there exist D, r, c, b, and v such that:
(5) Γ′ = <D, r, c, b, v>;
(6) Γ = <D, r, c> is a system of Ohmic observations;
(7) v and b are real numbers;
(8) for all x ∈ D,


(a) c(x) = v / (b + r(x)).

In this system, r and c are to be interpreted as observables; b and v, as theoretical
terms. The theory is finitely testable, since, in any structure that is not a
model, there exist substructures, m_k, with values of {c_k(x), r_k(x)} that do not
satisfy (a). It is irrevocably testable, for a set {c_k(x), r_k(x)} will not satisfy (a) if
any of its subsets fail to do so. In fact, it is k-testable, with k = 3, and hence is
UFIT.
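
A minimal sketch of the 3-testability claim follows. It assumes that axiom (8a) has the form c(x) = v / (b + r(x)), and (a further simplification of ours) that the observations have pairwise distinct r-values and pairwise distinct c-values. Rewriting the law as 1/c = b/v + r/v shows that a triple of observations satisfies (a) for some v and b exactly when the points (r, 1/c) are collinear. The function names and the data are hypothetical.

```python
from fractions import Fraction
from itertools import combinations

def triple_satisfies_law(triple):
    """True iff some v, b give c = v / (b + r) for all three observations.

    Uses the linearised form 1/c = b/v + r/v: the points (r, 1/c) must be
    collinear. Assumes pairwise distinct r and pairwise distinct c values.
    """
    (r1, c1), (r2, c2), (r3, c3) = triple
    y1, y2, y3 = 1 / Fraction(c1), 1 / Fraction(c2), 1 / Fraction(c3)
    return (r2 - r1) * (y3 - y1) == (r3 - r1) * (y2 - y1)

def disconfirmed(observations):
    """True iff some subset of three observations violates the law."""
    return any(not triple_satisfies_law(t)
               for t in combinations(observations, 3))

# Hypothetical (r, c) data generated with v = 12 and b = 2.
good = [(r, Fraction(12, 2 + r)) for r in (1, 2, 4, 10)]
# One further reading that the law would require to be 12/8, not 5.
bad = good + [(6, Fraction(5))]
```

Exact rational arithmetic is used so that the collinearity test is not disturbed by floating-point rounding; real measurements would of course raise the question of approximate fit, which this sketch ignores.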

In the earlier paper, it was shown (p. 20) that the theoretical terms, v and b, 
could be made definable in the sense of Tarski by adding the following require- 
ment to the definition of Ohmic circuit. 


(9) D contains at least two members with distinct r’s and c’s. (When this 
condition is satisfied, we call the system an identified Ohmic circuit.) 


However, including in the axioms of a system a stipulation of the minimum 
number of observations is awkward. The previous paper proposed to avoid this
difficulty by substituting a notion of general definability for Tarski’s notion of 
definability (Simon [1970], p. 22). We now see that if we are satisfied with the 
Ramsey-eliminability of theoretical terms, rather than their definability, the 
objectionable axiom (9) is not needed in any case. Ramsey-eliminability is weaker, 
however, than general definability, for the former does not require that the 
theoretical terms be uniquely determined, as is demanded by the latter. 


8 Testability of Classical Celestial Mechanics 


Does classical celestial mechanics—the conjunction of the Three Laws of Motion 
and the Law of Universal Gravitation—constitute a FIT theory of the motions 
of the solar system? The answer depends on how we interpret the elements of 
the set D. (See Simon [1970], pp. 23-26.) 

Suppose that we interpret an element, x, of D to be a set of observations on 
the positions, momenta, and accelerations of a definite set of planets—namely, 
all the known planets at a point in time. Then we can test whether these observa- 
tions satisfy the theory, e.g. whether the total momentum and angular momentum 
of the system is conserved, whether the inverse square law holds for accelerations, 
and so on. If the laws fail to hold for some set of observations then they are 
falsifiable by a finite subset of these observations (Simon [1947]), and no new 
set of observations of these same planets would revoke the falsification. Hence, 
interpreted in this way, classical celestial mechanics is a FIT theory. 

There is, however, an alternative interpretation that has historical significance. 
We interpret an element, x, of D, to be a set of observations on the orbits of a 
fixed set of planets. Again, such an observation provides a finite test for the 
theory, but now a falsification can be countered not simply by extending the 
observations numerically but by redefining what constitutes an observation— 
that is, by discovery of a new planet and its orbit. As we know, the theory has in 
fact been ‘resuscitated’ several times in exactly this way. 

Two points about this process should be noted. First, planets are discovered, 
not invented. To explain a deviation of observed from predicted orbits, one must 
discover and track a new point of light that itself has an appropriate orbit. The 
enumeration of the planets in the system may be regarded as one of the theory’s 
auxiliary hypotheses that determine what is an observation. But these hypotheses 
are not arbitrary. The theory, as generally interpreted, does not allow mass to 
be attributed to points in space where a planet cannot be detected optically (or
by some other non-gravitational means). ‘Invisible’ planets are outlawed. If the 
density of observations were sufficiently great it could even be said with assurance 
that no planets with mass greater than some constant n exist having such and such 
orbits. (The inference would require that there be known limits upon the 
minimum luminosity and maximum density of objects.) By such means, the 
theory could be made to approach a condition of irrevocable testability. 
Second, this analysis—and this paper in general—avoids the question of 
what it means for a theory to be approximately correct. ‘Small’ discrepancies
between observation and theory are often regarded not as falsifying the theory
but as grounds for treating it as only approximate. We cannot undertake to
develop this point here, but we must not forget its relevance to the broader issues 
of the testability of theories (see Simon [1968], pp. 439-443). We also eschew 
discussion here of issues of probability. 


9 Conclusion 


In this paper we have reexamined the question, raised earlier by Ramsey and 
Sneed, of the eliminability of theoretical terms from theories. First we imposed 
conditions of finite and irrevocable testability upon the class of systems to be 
regarded as well-formed theories. It was shown that these restrictions could 
be motivated quite independently of the question of eliminability of theoretical 
terms, for they are applicable to theories that contain no theoretical terms.
We proved that theoretical terms are always eliminable from theories that are 
finitely and irrevocably testable; and we defined an important class of theories 
that are uniformly finitely and irrevocably testable. We showed that an axioma- 
tisation of Ohm’s Law previously proposed is a FIT theory. Finally, we applied 
our analysis to clarify two concrete situations of historical importance in physics: 
tests of the existence of neutrinos, and the consequences for celestial mechanics 
of the discovery of new planets. 
HERBERT A. SIMON 
Carnegie-Mellon University 
GUY J. GROEN 
Carnegie-Mellon University 
APPARENT CONFLICTS BETWEEN QUINE’S INDETERMINACY
THESIS AND HIS PHILOSOPHY OF SCIENCE * 


1 The Indeterminacy Thesis.
2 Apparent Inconsistency with Quine’s Philosophy of Science.
3 Quine’s Reply.
4 Difficulties with Quine’s Reply.
5 Quine Defended.


1 The Indeterminacy Thesis

In ‘Two Dogmas of Empiricism’ Quine argued that no one had yet clarified the 
notion of synonymy, and suggested that if anyone ever does so, the clarification 
will ‘presumably [be] in terms relating to verbal behavior’ ([1953], p. 24). 
Similarly, he begins Word and Object with the assertion that the notions of mean- 
ing, synonymy, and proposition can be made sense of, if at all, only in terms of 
verbal behaviour—more specifically, in terms of verbal responses to stimuli. His 
reason is that in learning languages and in teaching them to others we depend in 
an essential way upon the relations between a speaker’s utterances and the stimuli 
affecting him (p. ix). The argument in Chapter Two of Word and Object is then 
supposed to reinforce the attack on analyticity and synonymy in ‘Two Dogmas’ 
by proving that the notion of meaning cannot be clarified in terms of verbal 
behaviour, from which it then follows that it cannot be clarified at all. The 
argument is that there may be many mappings of one language into another, or 
of one into itself, which are compatible with verbal behaviour and yet not intui- 
tively equivalent with each other. It follows that a translation’s compatibility 
with verbal behaviour does not assure preservation of meaning in the intuitive 
sense; and from this it follows that we could never clarify what meaning in the 
intuitive sense is. Accordingly, questions about the correctness of a translation, 
except a limited few which are settled by verbal dispositions ([1960], p. 68), do 
not concern any objective matters of fact concerning which one could be right or 
wrong ([1960], p. 73; [1969], p. 29). An important corollary is that there is no 
justification for positing propositions. For we cannot be justified in positing 
entities of a certain kind if we do not possess clear criteria for their identity and 
distinctness; and since propositions are presumably identical just in case they 
are the meanings of (can be expressed by) synonymous sentences, it follows that 
we lack such criteria ([1960], pp. 200, 206). 

What Quine means by saying that a translation is compatible with verbal 
behaviour may at first seem obscure. But from his remarks (ibid. pp. 68, 71) 
summing up the possible results of querying speakers under varying circum- 
stances, it is apparent that he construes such compatibility as the satisfaction of 
the following four constraints: 


(1) Each observation sentence¹ is translated into an observation sentence with
which it is stimulus synonymous. 


* Part of this material appears in my doctoral dissertation Indeterminacy of Translation, 
Quantum Logic and Necessary Truth, Harvard University, 1971. I wish to thank W. V. 
Quine, I. Scheffler, F. Mondadori, F. Thompson, T. Tymoczko, J. Thomason, J. L. 
Mackie, J. Giedymin, P. M. Williams and J. Worrall for helpful discussions and suggestions.

1 See Quine [1960], Chapter Two, for explanations of terminology used in this discussion.
(2) A truth function in one language is translated into the corresponding truth
function in the other. 

(3) Stimulus analytic or contradictory sentences are translated into sentences 
which are likewise stimulus analytic or contradictory. 

(4) Non-observational occasion sentences are translated into others which are 
stimulus synonymous for each bilingual. 


Very roughly, what these conditions amount to is the following: (1) if most
members of a speech-community A are prompted to assent to a given sentence S
by the same class C of stimuli, then S should be translated into a sentence S′ of
another speech community B most of whose members are prompted by C to
assent to S’. (2) is reasonably self-explanatory, given Quine’s behavioural 
criteria for recognising foreign truth functions. (3) requires that if a sentence 
commands nearly universal assent (or dissent) in community A, its translation 
should do so in community B. (4) requires that a bilingual be prompted to 
assent to (or dissent from) a sentence or its translation by the same stimuli. 

Clearly, the most convincing way to support the thesis that there are non- 
equivalent translations satisfying these constraints would be to produce examples. 
Quine makes only a gesture in this direction in Word and Object (p. 72), with an 
example relating to rabbits, which has aroused a great deal of sterile controversy. 
His only argument there was that one could translate a certain native sentence as 
‘Is that rabbit the same as the other?’ or as ‘Is that rabbit stage a stage of the 
same animal as the other?’, since assent and dissent would be expected in the 
same circumstances on either translational hypothesis. Accordingly, Quine 
concluded, it is meaningless to ask whether the word ‘gavagai’, which occurs in 
the native sentence, means ‘rabbit’ or ‘rabbit stage’. 

Understandably, many readers found this argument inadequate to bear the 
weight Quine seemed to place on it. While perhaps two different translations of 
‘gavagai’ and the surrounding construction could be accommodated for this 
particular sentence, Quine made no attempt to show that they would fit verbal 
behaviour in connection with all other sentences involving this word or this con- 
struction. Thus, it was reasonable to complain that the indeterminacy seemed to 
arise only because Quine arbitrarily refused to allow the translator to consider 
additional verbal behaviour or some other kind of evidence (Chomsky [1968], 
pp. 38-9).

More recently, Quine (see, for instance, his [1969], pp. 35-45) has remedied 
this defect in his argument by producing several much more plausible examples. 
First, there are certain three-word phrases in Japanese (schematically, ‘□ △ ○’)
which may be translated according to two plans. The first takes ‘□ △’ as a
numeral suitable for counting objects of the kind corresponding to ‘○’, and
‘○’ to be true of individual objects of that kind: thus, for example:


□ △ ○ → five cows.


The second takes ‘□’ as a numeral, ‘○’ as a singular (mass) term true of the
scattered totality of live beef, and ‘△’ as a term for dividing up the mass (like
‘sticks of’, ‘grains of’, or ‘head of’ in English):

□ △ ○ → five head of cattle,


1 Quine ([1969], p. 104) has more recently observed that there is some indeterminacy in
finding the ‘corresponding’ truth function.




with ‘cattle’ construed as a mass term. The two translations of the phrase as a 
whole are stimulus synonymous, even though they are obtained by assigning 
different objects to ‘O’. Japanese friends tell me there is no reason to regard one 
translation as correct and the other not.
Again, Quine has proposed several examples of indeterminacy in the translation 
of parts of English into other parts. For example, should an atomic symbol be
identified with the set of its inscriptions, or with its Gödel number? In either
case, such syntactical terms as ‘concatenation’ (i.e. ‘⌢’) can be defined so that
such syntactical laws as


x = z whenever x⌢y = z⌢y


are satisfied. 

Similarly, he suggests, we can follow Frege in defining n as the set of 
n-membered classes, or von Neumann in defining it as the set of its predecessors. 
In either scheme, we can define associated predicates and operations in such a 
way that all the laws of arithmetic are satisfied. There therefore seems to be no 
reason to regard Frege as right and von Neumann wrong, or vice versa. 

In such examples as these, we have to imagine that before their own adoption 
of some translation scheme or other, native speakers are merely puzzled by such 
questions as ‘Is a symbol its Gödel number?’ and ‘Is 3 a member of 4?’ Other- 
wise, their answers to such questions would support one translation over another. 
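
The contrast between the two reductions can be sketched in a few lines. This is an illustration under loud assumptions, not Quine's or Frege's own construction: Frege's numbers are classes of all n-membered classes, which cannot be built in a program, so the sketch substitutes a small finite `UNIVERSE` (hypothetical) and takes n to be the set of its n-membered subsets. Addition is defined in each scheme by way of cardinality purely for the sake of the example.

```python
from itertools import combinations

def von_neumann(n):
    """n as the set of its predecessors: 0 = {}, n + 1 = n ∪ {n}."""
    s = frozenset()
    for _ in range(n):
        s = s | {s}
    return s

UNIVERSE = ("a", "b", "c", "d", "e")  # hypothetical finite stand-in

def frege(n):
    """n as the set of all n-membered subsets of UNIVERSE (finite surrogate)."""
    return frozenset(frozenset(c) for c in combinations(UNIVERSE, n))

def vn_add(m, n):
    """Addition on von Neumann numerals, via the size of each numeral."""
    return von_neumann(len(m) + len(n))

def frege_add(m, n):
    """Addition on surrogate Frege numerals, via the size of a member class."""
    size = lambda k: len(next(iter(k)))
    return frege(size(m) + size(n))
```

Both schemes verify 2 + 2 = 4 (`vn_add(von_neumann(2), von_neumann(2)) == von_neumann(4)` and likewise for `frege_add`), yet `von_neumann(3) in von_neumann(4)` holds while `frege(3) in frege(4)` does not. The arithmetical laws are preserved under either scheme; only the answer to ‘Is 3 a member of 4?’ differs.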

With these three examples, Quine seems genuinely to have established the 
indeterminacy thesis, which we could only regard as speculative when we had 
just the rabbit example before us. Not only is it highly plausible that entire trans- 
lation schemes incorporating the above pairs of alternatives could be devised so 
as to satisfy the constraints (1)-(4); but it also is difficult to imagine any other
sort of evidence which could select one from each pair as uniquely correct. 
Accordingly, these examples seem well suited as a response to Chomsky’s claim 
that the indeterminacy arises solely because Quine artificially limits the evidence 
which the translator may consider. 


2 Apparent Inconsistency with Quine’s Philosophy of Science 

However, Chomsky ([1968]) has raised another issue which seems critical to 
understanding and evaluating the indeterminacy thesis. The problem is that 
Quine describes the relation of data to theory in science in almost exactly the 
same way as he describes the relation between the data and hypotheses of trans- 
lation. Yet it is only in the latter case that he concludes that there are no objective 
matters of fact for the hypotheses to be about. 

Consider again Quine’s discussion of translation. He has attempted to show 
that alternative linguistic theories (specifically, translations) are compatible with 
the relevant data (specifically, information concerning stimulus meanings), and 
has concluded that in all but a few instances of translation there are no objective 
matters of fact at issue. The linguist observes a certain amount of verbal 
behaviour, settles upon stimulus meanings by means of inductive generalisation, 
and is thereby enabled to locate truth functions, stimulus analytic sentences, and 
pairs of stimulus synonymous occasion sentences. And relative to this information
(a summary of relevant observations, past and future, actual and potential)
a manual of translation is underdetermined.

What is puzzling to Chomsky and many other readers of Word and Object is
that Quine describes the relation of theory to data in science and ordinary life in 
almost exactly the same way: 


The incompleteness of determination of molecular behaviour by the 
behaviour of ordinary things is hence only incidental to this more basic 
indeterminacy: both sets of events are less than determined by our surface 
irritations ... past, present, and future. ([1960], p. 22.) 


And yet Quine does not conclude from this under-determination that it is no 
objective matter of fact whether or not tables or even neutrinos exist. The reason 
is that Quine, like everyone else, has a particular conceptual scheme which he
‘takes seriously’ and within which he judges ‘earnestly and absolutely’ ([1960],
pp. 24-5). He therefore has a ‘fully realistic attitude toward electrons and muons
and curved space-time . . . despite knowing that [current physical theory] is... 
underdetermined’ (Quine [1968], p. 303). 

The puzzle here is this: if under-determination of physical theory by its data 
does not imply that physics fails to deal with objective matters of fact, why should 
the under-determination of translation by its data imply that translation does 
not deal with objective matters of fact? Or—to relate translational indeterminacy 
to the existence of propositions—if the fact that certain questions about pro- 
positions (e.g. their identity and diversity) remain unsettled by all possible 
observations implies that propositions do not exist, then why doesn’t the fact that 
certain questions about electrons are correspondingly undecidable by observa- 
tion imply that electrons do not exist? Why, to put it yet another way, can’t we 
view a translation manual as something like a scientific theory which, taken as a 
whole, has certain testable implications (concerning relations of stimulus mean- 
ings) but some of whose sentences (specifically, analytical hypotheses) are not 
testable in isolation? 

It looks, then, as if Quine is arbitrarily applying a reductionist criterion of 
meaningfulness in linguistics which he certainly would not wish to apply in 
physics. That is, it may appear that Quine is demanding that the notion of a cor- 
rect translation must, in order to be used in statements of objective fact, be 
defined explicitly in terms of the evidence in its favour—namely, relations be- 
tween stimulations and verbal dispositions. He would not make the analogous 
demands for the notions of electron and wave function. 

In addition to this reductionism, similar in spirit to phenomenalism and 
operationalism, Quine’s views on translation seem to commit him to a second 
old-fashioned ‘ism’¹: a kind of conventionalism analogous to that of Reichenbach.
Conventionalist tendencies emerge in his discussion of his claim that one of the 
reasons many persons have failed to perceive the indeterminacy is that practicing 
linguists adhere to certain implicit maxims which narrow down the choice of 
translations. For example, they try (i) to avoid translating widely accepted native
sentences into English sentences which are obviously false ([1960], p. 59). They
also attempt (ii) to make the translational mapping preserve length to the greatest
extent compatible with constraints (1)-(4), and (iii) to exhibit part-by-part
1 To forestall a possible misunderstanding: I do not, of course, intend the term
‘old-fashioned’ as some sort of argument against the views so labelled. It is only intended to
point up the irony involved in Quine’s recent reliance upon positivist views which he is
famous for having rejected long ago.
correspondences insofar as possible. These latter two maxims make them likely
to rule out e.g. ‘undetached rabbit part’ as a translation of ‘gavagai’. 

Such maxims are not capricious, for they may be supported by obvious 
psychological and practical considerations: (i) is supported by the consideration
that (a) it is less likely that large numbers of persons should believe something
absurd than it is that a single translator should be mistaken. (ii) reflects the view
that (b) speakers are likely to use short terms to refer to objects (like rabbits) to
which they have occasion to refer relatively often, and to construct from these
words longer terms (like ‘undetached rabbit part’) to refer to objects to which
they have less occasion to refer. (iii) may be supported on the pragmatic ground
that a translation scheme, in order to be serviceable, must be systematic in the 
sense that the translations of phrases and sentences should be constructible in 
fairly uniform ways from the translations of their parts ([1960], pp. 74-5). 

(a) and (b) have the look of empirical hypotheses which might be proposed on 
the basis of common-sense psychology and borne out by the investigation of 
languages. But Quine holds that the investigation of languages is irrelevant here. 
His reasoning is that we cannot verify that the ‘laws’ (a) and (b) hold without 
adopting some methodological principles beyond (1)-(4), and that once we have
adopted (i) and (ii) the ‘laws’ follow immediately. He therefore thinks it better
not to view them as ‘substantive laws of speech behaviour’ but rather as mis- 
leadingly phrased methodological maxims ([1960], p. 74). The principle ‘correct 
translations conform to (i)-(iii)’ is regulative rather than constitutive, a convention
rather than an empirical hypothesis. Laws such as (a) and (b), which are
obtained as a consequence of adopting this regulative principle, therefore share 
its conventional character. 

The parallel with Reichenbach’s geochronometric conventionalism is quite 
striking. Suppose we wish to establish that light travels with the same speed in 
both directions along a certain path. Then, argued Reichenbach ([1927], pp. 
125-7), we shall have to have synchronised clocks at the end-points of the path 
to determine whether the two travel-times are equal. But the only way to syn- 
chronise clocks is by sending a signal of known velocity between them. However, 
we cannot determine any velocities unless we already have synchronised some 
clocks, which we cannot do unless we already know a velocity. Because of this 
circularity, Reichenbach argued that we must simply adopt it as a convention— 
not as a (verifiable) empirical hypothesis—that light travels with the same speed 
in both directions. Other conventions are less convenient but cannot be regarded 
as false. 
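
Reichenbach's circularity can be put in a short worked form. The notation below is an assumption introduced for illustration (the symbols t_1, t_2, t_3, \varepsilon, c_+ and c_- do not appear in the passage above): a light signal leaves clock A at t_1, is reflected at clock B, and returns to A at t_3, and the question is what time t_2 to assign to the reflection.

```latex
% Only the round-trip time t_3 - t_1 is measured with a single clock.
% The time assigned to the distant reflection event is fixed by a choice of \varepsilon:
t_2 = t_1 + \varepsilon\,(t_3 - t_1), \qquad 0 < \varepsilon < 1 .
% With c the measured round-trip average speed, the one-way speeds are then
c_{+} = \frac{c}{2\varepsilon}, \qquad c_{-} = \frac{c}{2(1-\varepsilon)} .
% Equal speeds in both directions correspond to the choice \varepsilon = \tfrac{1}{2}.
```

On this way of putting it, every admissible value of \varepsilon reproduces the same round-trip time, which is why Reichenbach treats the choice of 1/2 as convenient rather than verified.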

We can now see the parallel to translation as viewed by Quine. Suppose we wish
to reject a proposed translation of someone’s remark on the ground that persons 
like him could not possibly hold a belief as absurd as the translation attributes to 
him. But in order to know how absurd the beliefs of persons like him are, we 
shall have to know how to translate their remarks, which we cannot do unless we 
can reject some proposed translations on such grounds as that they attribute 
excessively absurd beliefs. Hence, according to Quine, we must simply adopt it as 
an undefended maxim not to translate in such a way as to attribute excessively
absurd beliefs. We therefore hold that people tend not to believe things which are
too absurd—but only as a convention, not as an empirical hypothesis which
might turn out to be either true or false. And if we use this convention to choose
between translation schemes satisfying constraints (1)-(4), that choice will be
conventional rather than correct or incorrect ([1960], pp. 59, 69, 77). 

It is quite clear what view Quine’s philosophy of science commits him to
taking of Reichenbach’s geochronometric conventionalism. He would have to 
agree that the hypothesis that light travels with equal speeds in both directions 
along a path cannot be verified without making certain other assumptions—e.g. 
about the present condition of and laws obeyed by one’s clocks. But he would say 
this has no tendency to show that the hypothesis is conventional rather than 
empirical. After all, ‘our statements about the external world face the tribunal of 
sense experience not individually but only as a corporate body’ ([1953], p. 41). 
That light travels with velocity c in all directions and in all frames of reference is 
an assertion we have excellent reason to regard as true, not because we have, per 
impossibile, tested it in isolation, but because we have strongly corroborated the 
total theory, special relativity, of which it is a part. 

Again, we should not be misled by the fact that convenience—i.e. simplicity of
the total system of laws—plays a role in our decision to say the to-and-fro speeds 
are equal. This fact, Quine would say, does not make our decision the adoption 
of a convention. The convenience of any hypothesis—even the hypothesis that 
there are trees and rocks—in enabling us to predict future experience in the light 
of past experience plays a role in our decision to accept it (ibid. p. 44). But
whatever our reasons for doing so, if we accept a theory, we accept it as true, and do
not regard it as a mere convention (cf. [1960], pp. 24-5). 

By applying Quine’s general philosophy of science to Reichenbach’s position, 
we obtain a criticism which is (I think) correct in its essential respects. One thing 
that must be said on Reichenbach’s behalf, however, is that he is at least consist- 
ent. Whenever two theories are both compatible with the evidence, he says the 
choice between them is a matter of convention, of choosing the simpler of 
equally true ‘equivalent descriptions’. Thus, since he thinks it is possible to 
formulate a theory, complicated but consistent with observation, according to 
which trees vanish when unobserved, he concludes that it is a matter of conven- 
tion whether unobserved trees exist ([1944], p. 19). 

Whatever its other virtues, Quine’s position lacks this thorough-going con- 
sistency. His philosophy of linguistics differs in all sorts of unexplained ways 
from his philosophy of physics. If the statements of a physical theory cannot be 
tested except in conjunction with each other, the theory still has meaning as a 
whole. But if we cannot translate a certain singular term without assuming a 
translation of certain predicates ([1960], p. 72; [1969], p. 47), then the
correctness of either translation is ‘objectively indeterminate’. Similarly, since without
assuming some translation schemes, we cannot verify that a relatively short term 
like ‘gavagai’ is more likely to refer to an enduring physical object than to related 
fusions, stages, or universals, this general rule is not a genuine empirical hypo- 
thesis ([1969], p. 34). And if we accept one translation rather than another simply 
because it is more convenient, we ought not for that reason to regard it as right 
and the other wrong ([1960], pp. 73-5). 

Seeing no justification for these apparently arbitrary distinctions between
physics and linguistics, Chomsky concluded that the indeterminacy thesis is true,
but for the uninteresting reason that most theories in any field are
underdetermined by their data:

It is quite certain that serious hypotheses . . . will “go beyond the evidence”.
If they did not, they would be without interest. Since they go beyond mere 
summary of data, there are competing assumptions consistent with the 
[actual and possible] data. But why should all of this occasion any surprise 
or concern? ([1968], pp. 66-7) 


3 Quine’s Reply
Quine had heard this line of criticism before the publication of Word and Object, 
and therefore included this response to it in the book: 


May we thus conclude that translational synonymy at its worst is no worse 
off than truth in physics? To be thus reassured is to misjudge the parallel. 
... One is always working within some comfortably inclusive [physical] 
theory, however tentative. ... By contrast, we are always ready to wonder 
about the meaning of a foreigner’s remark without reference to any one 
set of analytical hypotheses, indeed even in the absence of any. ([1960], 
pp. 75-6).

This argument does not require much comment. Obviously, the fact is that we do 
have standard translation schemes we are usually prepared to apply to the re- 
marks of most foreigners. There are imaginable circumstances under which we 
would alter these schemes or construct new ones from scratch; but the same is 
true of physical theories. In any case, Quine’s comments—whether true or false— 
have no apparent bearing on the question at issue: why does under-determination 
of a theory by its data imply, in the case of linguistics but not in the case of 
physics, that the theory concerns no objective matters of fact? 
The thus far unstated assumption underlying everything Quine has said on 
this matter finally emerges in his reply to Chomsky: 


Though linguistics is of course a part of the theory of nature, the indeter- 
minacy of translation is not just inherited as a special case of the under- 
determination of our theory of nature. It is parallel but additional. Thus, 
adopt for now my fully realistic attitude towards electrons and muons and 
curved space-time, thus falling in with the current theory of the world 
despite knowing it is in principle methodologically underdetermined. 
Consider, from this realistic point of view, the totality of truths of nature, 
known and unknown, observable and unobservable, past and future. The 
point about indeterminacy of translation is that it withstands even all this 
truth, the whole truth about nature. This is what I mean by saying that, 
where indeterminacy of translation applies, there is no real question of right 
choice; there is no fact of the matter even to within the acknowledged under- 
determination of a theory of nature. ([1968], p. 303.) 


Quine is saying that translation is indeterminate, not only relative to any sup- 
porting data (concerning verbal behaviour), but also relative to something called 
our current ‘theory of nature’. He therefore thinks we have a salient difference 
between physics and translation: physics, though under-determined by its own
data, certainly does not remain underdetermined once a theory of nature has
been selected. For physics is (at least) part of our theory of nature. In the
paragraph preceding the one just quoted, Quine says there is no first philosophy
higher than physics, and later that ‘no higher standard offers’ than the theory
of nature. Evidently, then, he wishes to identify the so-called theory of nature 
with physics. The thrust of Quine’s argument is therefore that since there are 
several translational schemes compatible with all possible relevant data and 
compatible with physics, it is not an objective matter of fact which of such 
translations is correct. 

That this is his view is even clearer in a more recent paper ([1970]). He says 
that the indeterminacy thesis is essentially the claim that if two physical theories 
A and B are compatible with all data we ever will or could get, and we have more 
or less arbitrarily ‘adopted A for ourselves, we cannot infer from it whether to 
translate the foreigner as believing A or as believing B’. To be undetermined by 
the choice of a physics, then, is to be indeterminate. 

It is now apparent that in addition to reductionism and conventionalism, 
Quine’s view rests upon a third old-fashioned ‘ism’—namely, physicalism: the 
thesis that the only real facts are physical facts, that is, facts about the motions of 
particles and facts about fields, potentials, wave functions, etc. which we use to 
systematise the former facts.1 This theory dates from the early Vienna Circle 
days, but was asserted by Carnap more recently: 


All laws of nature, including those which hold for organisms...and... 
societies, are logical consequences of the physical laws, i.e. of those laws
which are needed for the explanation of inorganic processes. ([1963], p. 883.) 


The view, then, is that since physics deals with everything there is and tells us 
approximately where it is and what it is doing at any given moment, the other 
sciences, which deal with the same stuff except in larger bundles and under 
different names, must be derivable from the laws of physics. To the extent that 
they are not, given some re-naming to permit formal derivation, they are mean- 
ingless; their truth cannot be an objective matter of fact. (This is a view to which 
Quine has subscribed in conversation.) Where linguistic phenomena are con- 
cerned, the only real facts are that certain patterns of radiation strike the sensitive 
surfaces of human organisms and that they occasionally emit certain sounds 
related to these patterns in complicated ways: 


What we objectively have is just an evolving adjustment to nature, reflected 
in an evolving set of dispositions to be prompted by stimulations to assent to 
or dissent from sentences. ([1960], pp. 38 f.) 


And if a translation is indeterminate relative to our physical theory and these 
‘objective’ (physical) facts, there is no question of its being right or wrong. 


4 Difficulties with Quine’s Reply 

The issues Quine raises here, while difficult, are probably not beyond adjudica- 
tion. We should first notice that Quine gives us no reason to believe that physic- 
alism, upon which his entire argument is based, is true. This may be because 
when one is in a certain frame of mind, physicalism looks obvious. Is it not
obvious that physics deals with everything there is, and tells us all we
could ever want to know about it? There are two aspects to this question, an
ontological and an ideological.

1 This line of thought derives from a suggestion made by Saul Kripke in conversation.

First, do we know that the ontology of physics exhausts the universe, so that if 
some theory posits some objects—e.g. propositions or super-egos—not posited
by physics, we can forthwith reject that theory as false or meaningless? The
difficulty with this question is that it pre-supposes that there is such a thing as the 
ontology of physics, and this view ignores the obvious fact that what the objects 
and laws of physics are is subject to change through the course of time, and to 
disagreement at any given time. An argument to the effect that the postulation of 
propositions is illegitimate, because ‘physics’ does not mention them and hence 
settles no questions about e.g. their identity and diversity, lacks specificity— 
since it is unclear what physics is meant.1 That changes in physics may actually 
affect the argument is made quite clear by recent proposals by workers in the 
foundations of quantum theory to include—of all things—propositions in the 
ontology of physics. They have argued that axioms concerning a certain system 
of propositions constitute a better starting point for quantum theory than the 
standard Hilbert space axioms.2

Certain of Quine’s remarks, however, suggest that he may have in mind a thesis 
which is at least specific. Quine urges us to abandon the notion that philosophy 
provides the ‘foundations’ of common-sense and scientific knowledge while 
operating in some ‘cosmic exile’ where no such knowledge is taken for granted. 
He recommends rather that we philosophise from the standpoint of ‘the best 
[conceptual scheme] we know—right down to the latest detail of quantum 
mechanics, if we know it and it matters’ ([1960], pp. 4, 275 f.). Assuming that 
Quine is following his own advice, and in order to give his physicalism a more 
specific content, let us suppose that Quine thinks, for some reason, that all the 
‘real facts’ are embodied in the standard physical theories of today (‘right down 
to the latest detail of quantum mechanics’), and that these theories posit nothing 
but particles and a large variety of sets (functions, real numbers, etc.). Now 
suppose that some psychologist proposes to explain a certain kind of behaviour 
by positing some sort of ‘hypothetical construct’—a super-ego, for example. 
Would it be a good argument against this psychologist’s theory to say, ‘We have 
already chosen our physics, and it does not make any provision for super-egos. 
So they do not exist, and any statements you make about them cannot concern 
any objective matters of fact’? This would be sheer dogmatism. After all, Quine 
says he would even posit the gods of Homer if doing so would be ‘efficacious . . . 
as a device for working a manageable structure into the flux of experience’ 
([1953], p. 44). Would he not, then, be willing to postulate super-egos if doing so 
would explain why people have certain inhibitions? Clearly, the only good 
argument against the psychologist in question would be that whatever he is try- 
ing to explain with super-egos can be explained better without them. The mere 
fact that statements about super-egos are not decided by the choice of a physics is 
no argument whatever against them. And yet it is a precisely parallel argument 
which Quine uses against statements about propositions and correct translations. 

1 This problem has been emphasised in discussions of inter-theoretic reduction by
Scheffler ([1950]), Nagel ([1961]), and Hempel ([1969]).
2 Cf. Gardner [1971].

Let us now be even more charitable and assume with Quine that any theory
with an ontology more extensive than that of today’s orthodox physics can be
rejected out of hand. Are we then committed to saying that if any statement is not
implied by the physical facts, as determined by this physics, it cannot be true? 
This question recalls the ideological aspect of physics, mentioned two
paragraphs back. As Quine would be the first to point out, the ‘ontology of a theory
stands in no simple correspondence to its ideology’—that is, to the ideas which 
can be expressed in it ([1953], p. 131). If the quantificational variables of physics
range over everything there is, it does not follow, of course, that the predicates of 
physics suffice to define the predicates of each true statement, much less that the 
laws of physics conjoined with certain other statements of physical fact suffice to 
imply all true statements. In addition, it is unclear on what basis one could select 
a particular historical stage of physics as that whose predicates allegedly con- 
stitute a universally adequate language. 

The extraordinarily broad and undefended claim to which Quine’s defence of 
the indeterminacy thesis commits him is a corruption of his project in Word and 
Object of specifying a canonical language for science. He claims there that ‘all 
traits of reality worthy of the name can be set down’ in a canonical idiom con- 
sisting solely of variables, predicates, truth functions, and quantifiers. He makes 
it clear, however, that this doctrine specifies only the form of the canonical 
language; ‘of itself it sets no limits to the vocabulary of unanalyzed general 
terms admissible to science’. It only ‘sets limits to the ways of deriving complex
predicates . . . from these undictated components’. He even raises the question 
whether ‘we may not still aspire to the discovery of some fundamental set of 
general terms on the basis of which all traits and states of everything could in 
principle be formulated’, and shows that this aspiration is vulnerable to a paradox 
similar to Grelling’s (Quine [1960], pp. 226-32). 

Apart from this logical consideration, it certainly seems an unpromising enter- 
prise to select a list of predicates from some current physical theory, and then try 
to define all other meaningful predicates in its terms. We have decades of experi- 
ence with translating a huge variety of sentences into quantification theory, and 
thus can reasonably be confident of its adequacy for representing a very wide 
class of sentences and inferences. But we do not have many examples of
successful translations into physical terms of, e.g. ethical, aesthetic, and psychological
terms. And in the absence of these, we cannot have adequate grounds to believe 
that all, e.g. ethical, aesthetic, and psychological truths can be deduced from 
physical facts and theories, even if we could somehow settle which facts and 
theories to regard as ‘physical’.


5 Quine Defended 


The physicalist, conventionalist, and reductionist arguments Quine uses to 
support the indeterminacy thesis are unfortunate, not only because they conflict 
with certain obvious truths and Quinean doctrines, but also because they obscure 
his real point about the notion of meaning. Let us look back over the history of 
this discussion. In his [1953] Quine claimed that we lacked a satisfactory explica- 
tion of ‘meaning’ and that if one is forthcoming, it will presumably relate to 
verbal behaviour. In his [1956], Carnap took up this suggestion, claiming that an 
hypothesis about meaning can be tested by asking a speaker whether he would
apply a word in various circumstances described by the experimenter in such
terms as those of size, shape, and colour (pp. 233-47).

It was Quine’s insight to see that Carnap’s proposal—or any other which
attempts to explain meaning in terms of the relations of utterances to stimuli—is
in principle inadequate to solve the problem he posed in [1953]. Words do not in
general relate to the world in such a simple and direct way that their function is 
exhausted by an account of their application to observable and describable 
objects. Some words, of course, refer to objects which are too small to be observ- 
able, and some to objects which are unobservable because abstract. There are 
likely many incompatible theories about the unobservable which are compatible 
with all we could ever observe. Since our beliefs are not fully determined by 
what we observe, we are to some extent unable to determine what a man believes 
by determining what stimuli have affected him. It follows that we are at least to 
that same extent unable to determine how to translate the sentences he utters in 
response to those stimuli (Quine [1960], pp. 64, 78), assuming that we have only 
the relations between stimuli and responses to go on. Quine supplements this 
general argument with some specific examples of translations between which no 
decision seems possible on the basis of verbal behaviour or, indeed, any other kind 
of evidence (cp. 1, above).

Now is there, as some of Quine’s critics have maintained, really no cause for 
surprise or concern here? Well, surely it is a defect of a theory of meaning such as 
that sketched by Carnap that in many cases it provides no way to determine 
whether two words have the same meaning. It would obviously be a defect of a 
theory of motion that it provided no way to determine if two particles have the 
same velocity. 

It might be replied, in a Chomskian vein, that since there is no non-circular 
definition of velocity as the outcome of a specified kind of test, there will be in- 
compatible ascriptions of velocity consistent with our test results. But we do 
not conclude from this that ‘velocity’ is meaningless, but rather that operation- 
alism is false.1 And neither ought we to draw the corresponding conclusion that 
‘synonymy’ is meaningless when it turns out that that property is not definable in 
terms of the evidence for it. 

The error in this reply consists in overlooking the crucial difference between 
these two cases. Though velocity is not, by any non-circular definition, a certain 
test result, still a dynamical theory such as classical or quantum mechanics does 
provide a general framework within which tests of velocities do provide determin- 
ate answers. Assuming the correctness of our mechanical and electromagnetic 
theories and of certain statements about our instruments, our experimental tests 
yield us definite answers to questions about velocities. The fact that there are 
probably alternative dynamical theories compatible with these test results does 
not change the fact that the theory we actually accept makes them tests of some- 
thing determinate. 

1 I presuppose here Putnam’s arguments against operationalism contained in his [1970].

The same cannot be said of the sort of semantical theory sketched by Carnap.
He does not provide a set of background assumptions relative to which questions
about the meanings of words have definite answers—e.g. the question whether
‘mass’ means the same in Newtonian and in relativistic mechanics, or whether
von Neumann’s or Frege’s definitions of ‘number’ are correct. The point is not
that other, incompatible background assumptions are consistent with verbal
behaviour data. The point is that a theory like Carnap’s provides no adequate set
of background assumptions whatever. The suggestion that we settle questions
about meaning by probing dispositions to respond verbally to descriptions or 
stimuli is inadequate, Quine has shown. It leaves many questions unanswered, 
since it is unaccompanied by any general theory analogous to the electrical and 
mechanical theories of the physicist, which enable him to interpret the responses 
of his instruments. Unlike Quine, I see no reason to insist that the semanticist’s 
background theory be physics, so long as it is an empirical theory in the light of 
which verbal dispositions can either confirm or refute semantical hypotheses 
left indeterminate by theories proposed thus far.1

Chomsky’s and my criticisms notwithstanding, it seems clear that in his 
[1960] and [1969] Quine significantly strengthened his early ([1953]) attack on 
the notion of meaning. In the early essay, he was able only to show the inadequacy 
of certain explications which had in fact been proposed. In the more recent work, 
he has been able to exhibit some questions about meaning which seem undecid- 
able by any evidence anyone can think of, and has produced a general argument to 
show that a particular kind of evidence—verbal responses to stimuli—is inade- 
quate. This ought to intensify the scepticism about meaning of anyone moved 
by appeals to a verification principle, or of anyone who thought meaning must 
lie in, if anything, verbal responses to stimuli. Where it seems to me that Quine 
went too far was in claiming not only that a certain approach to the theory of 
meaning is inadequate, but in claiming further that the theory of meaning has no 
subject matter since it is undetermined by the only real facts, the physical 
facts. 
MICHAEL R. GARDNER 
Mount Holyoke College 

CONFIRMATION AND THE DUTCH BOOK ARGUMENT*


1 In 1955 papers by Kemeny, Shimony and Lehman appeared which purport
to establish the probability axioms as the axioms of confirmation.1 The three
articles are adaptations of the Dutch Book Argument (DBA) from its original
subjectivist context. No argument in support of ‘c = p’ has found such wide
acceptance as the DBA, and these articles have been widely cited. The aim of 
this paper is to re-inforce existing objections to the DBA and to propose a 
new one. 


2 The commonsense foundation for the DBA is that it is irrational to bet for a 
sure loss. The concept of coherence is defined in order to express this idea as a
condition on betting quotients (BQs). 
DEFINITION: A set of BQs on related propositions2 is coherent iff there are
no stakes for which they give a sure loss. 
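
The definition can be made concrete with a small numerical sketch. It assumes the usual reading of a betting quotient, on which a gambler betting on h at quotient p and stake S pays pS and receives S if h is true. The function names and the figures 0.6, 0.4 and 10 are illustrative assumptions, not drawn from the papers under discussion. Note that the sketch checks one fixed choice of stakes, whereas the definition quantifies over all stakes.

```python
def net_gain(quotient, stake, proposition_true):
    """Gambler's net gain on one bet: pays quotient*stake, receives stake if the proposition is true."""
    return (stake if proposition_true else 0.0) - quotient * stake


def sure_loss(bets, worlds):
    """bets: list of (quotient, stake, truth_fn); worlds: the possible outcomes.
    Returns True when the gambler's total gain is negative in every world."""
    return all(
        sum(net_gain(q, s, truth_fn(w)) for q, s, truth_fn in bets) < 0
        for w in worlds
    )


# Two related propositions, h and not-h; a world is just the truth value of h.
worlds = [True, False]
on_h = lambda w: w
on_not_h = lambda w: not w

incoherent = [(0.6, 10.0, on_h), (0.6, 10.0, on_not_h)]   # quotients sum to 1.2
probability = [(0.6, 10.0, on_h), (0.4, 10.0, on_not_h)]  # quotients sum to 1

print(sure_loss(incoherent, worlds))   # True: the gambler loses 2 whichever way h turns out
print(sure_loss(probability, worlds))  # False: the gambler breaks even in both worlds
```

For these stakes the first pair of quotients gives a sure loss and the second does not, which is the pattern the Dutch Book Theorem generalises.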


Using the definition of ‘coherence’, the structure of the DBA can be given in 
a very simple form. 


Premise 1. Rationality Premise (RP): BQs are rational only if coherent. 
Premise 2. Dutch Book Theorem: BQs are coherent only if a probability. 
Conclusion: BQs are rational only if a probability. 

The adaptation for degrees of confirmation is as follows: 


Degrees of confirmation are rational BQs, 
so,
Degrees of confirmation are a probability. 


The substantial part of the DBA is that in support of Premise 2, the Dutch 
Book Theorem. In Kemeny’s argument a crucial move is made, which appears 
in two different forms. It appears firstly in what is assumed about betting against, 
rather than on, a proposition, and secondly in assumptions about ‘reversed bets’. 
Firstly a bet is supposed to be made against some h, when the BQ given, is for 
a bet on h. Secondly, having considered a set of bets, Kemeny considers what he 
calls ‘the reversed case’, the reverse that is, of the entire set of bets. In both
cases the gambler is supposed simply to have stepped into his partner’s shoes.
In other words it is assumed that if a gambler is willing to bet on h at odds of
pS : S - pS, he is willing to bet on not-h at odds of S - pS : pS. This assumption,
which I call bet reversal, is essential to Kemeny’s Theorem.

* From a thesis in partial fulfilment of the requirements for the degree of M.A. (Hons.)
at the University of Sydney. I am grateful to Mr D. C. Stove of the University of Sydney
for patient and helpful criticism. A paper on this topic was presented to the conference
of the Australasian Association for Philosophy, August 1970, at Sydney. Professor D.
Gasking’s comments at that time put me on, I believe, the right track.

1 The three papers all appeared in the Journal of Symbolic Logic, 20. They are: Kemeny
[1955], Lehman [1955], Shimony [1955]. Unless otherwise stated, references are to these
papers.

2 Hereafter I shall write simply ‘BQs’ for the whole expression ‘set of BQs on related
propositions’.

The papers of both Shimony and Lehman allow the stake S, of a bet, to range 
over negative and positive values. To receive, or to give, a negative sum has a 
perfectly clear meaning; a bet on h with a ‘negative’ stake is a bet against h;
negative stakes merely reverse the bet. To allow, as do Shimony and Lehman, 
the stake to range over both positive and negative values, is to treat as indifferent 
to the argument which side of the bet is taken. Use of the negative range of 
values for S plays the same role in their arguments as does Kemeny’s use of bet 
reversal in his. Negative stakes/bet reversal are of a piece, and the justification 
for use of either is not at all obvious. 
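
That negative stakes and bet reversal come to the same thing can be checked arithmetically. The sketch below assumes the same payoff convention as before (the gambler pays pS and receives S if the proposition bet on is true); the quotient 1/4, the stake 8 and the function name are illustrative assumptions. Exact fractions are used so that the equalities are not disturbed by rounding.

```python
from fractions import Fraction


def net_gain(quotient, stake, proposition_true):
    """Gambler's net gain: pays quotient*stake, receives stake if the proposition is true."""
    return (stake if proposition_true else 0) - quotient * stake


q, S = Fraction(1, 4), 8

for h_true in (True, False):
    on_h = net_gain(q, S, h_true)                  # ordinary bet on h
    negative_stake = net_gain(q, -S, h_true)       # same quotient, stake -S
    on_not_h = net_gain(1 - q, S, not h_true)      # bet on not-h at quotient 1 - q

    # A negative stake gives the gambler exactly what his partner would have received...
    assert negative_stake == -on_h
    # ...and that is the same as betting on not-h at the complementary quotient.
    assert negative_stake == on_not_h
    print(h_true, on_h, negative_stake, on_not_h)
```

The arithmetic shows only that the two devices are equivalent; it does nothing to show that a gambler willing to take one side of a bet must be willing to take the other.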


3 The central concept of the DBA is that of coherence, and a striking difference 
between particular versions is the occurrence of prima facie inconsistent defini- 
tions of ‘coherence’. The definition I have given, following de Finetti, Shimony, 
Carnap and others, is: 


BQs are coherent if there are no stakes for which the gambler is bound to 
lose. 


This fits exactly Ramsey’s original idea of ‘a measure of consistence, namely 
such a consistency between the odds acceptable on different propositions as 
shall prevent a book being made against you’.1 We could call this a definition of 
‘weak coherence’, the qualification being used to distinguish it from a stronger 
notion: 


Strong coherence: BQs are strongly coherent if there are no stakes for which 
the gambler is either bound to lose, or bound to win.2


It is not difficult to see that use of strong, rather than weak coherence in the 
theorem would account for the use made of bet reversal. Since strong coherence 
prevents a sure win, it prevents a sure loss for one’s partner. So strongly coherent 
BQs safeguard against a sure loss even if some or all the bets are reversed. It 
seems then, that ‘coherence’ must be taken in the Theorem in its strong sense, 
and this will provide a licence for the use of bet reversal/negative stakes. To avoid 
equivocation, therefore, ‘coherence’ must be so taken in RP. Consistency demands 
that Premise 1 be stated as Strong RP:


Strong RP: BQs are rational only if there are no stakes for which the gambler 
makes either a sure loss or a sure win. 
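
The difference between the two notions shows up in a simple case. The following sketch assumes the payoff convention used throughout (pay pS, receive S if the proposition is true) and takes, purely for illustration, a gambler whose quotient is 2/5 on h and 2/5 on not-h, betting equal stakes on each. The names and figures are hypothetical.

```python
from fractions import Fraction


def net_gain(quotient, stake, proposition_true):
    """Gambler's net gain: pays quotient*stake, receives stake if the proposition is true."""
    return (stake if proposition_true else 0) - quotient * stake


def totals(q_h, q_not_h, stake):
    """Total gain in each world (h true, h false) from betting on both h and not-h at one stake."""
    return [
        net_gain(q_h, stake, w) + net_gain(q_not_h, stake, not w)
        for w in (True, False)
    ]


def sure_loss(gains):
    return all(g < 0 for g in gains)


def sure_win(gains):
    return all(g > 0 for g in gains)


q = Fraction(2, 5)             # quotients on h and not-h sum to 4/5
positive = totals(q, q, 10)    # gambler takes his own side of both bets
negative = totals(q, q, -10)   # both bets reversed

print(positive, sure_win(positive))    # gains of 2 in both worlds: a sure win
print(negative, sure_loss(negative))   # gains of -2 in both worlds: a sure loss
```

With the positive stakes these quotients expose the gambler to no loss at all, so it is only once the bets are reversed that they yield a sure loss. That is why a theorem which relies on reversal needs the strong notion.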


I believe that one factor at work which has aided the acceptance of the DBA 
is that RP is introduced as though only weak coherence were meant. The 
opening two paragraphs of Lehman’s paper are a good example of this. In an 
article elsewhere Kemeny3 gives this brief definition:

A set of BQs is coherent if for all stakes they give a sporting chance of
winning.

Shimony’s explanation of coherence is that X’s beliefs are incoherent if ‘one
may propose a set of stakes such that in betting for these stakes in accordance
with his evaluations, X is bound to lose, ...’. Despite these presentations as of
weak coherence, the precise definitions give strong coherence, and strong RP.

1 See Ramsey [1926], p. 8-

2 The distinction here introduced between strong and weak coherence is not to be confused
with the distinction between coherence and strict coherence. For the purposes of this
paper the latter distinction is irrelevant, and I shall take no account of it.

3 Kemeny [1963], p. 720.


4 On what grounds should we accept strong RP? 

The foundation of the DBA, and the only support for RP is the commonsense 
desire to avoid a sure loss. But in this principle alone there is no support for 
strong RP, which also enjoins the avoidance of a sure win. One bets normally 
in the hope of a win; why not for a sure win? Strong RP cannot be supported 
simply by pointing to the obvious undesirability of a Dutch Book against us. 
Strong RP appears, prima facie, to be a perverse addition to weak RP. 

While strong RP is implausible in an obvious way, weak RP is dubious in a 
less obvious way. It is an injunction of extreme caution. In order to justify the 
injunction we would have to suppose that our partner would attempt to take 
advantage of us by setting stakes for which he was bound to win, and further, 
that we would be silly enough to accept. Justification looks still more difficult 
when the argument is removed from a conventional gambling situation to a 
general theory of rationality, which is supposed to involve ‘betting with nature’. 
We would need to make what Lakatos calls a Manichean assumption that 
‘an evil power will catch us out by a shrewdly arranged system of bets’.1 
(Lakatos quotes Putnam who seems to have first noted this weakness of the 
DBA.) Shimony, in a 1967 article,2 also draws attention to the extreme strength
of weak coherence: 


In spite of the elegance of de Finetti’s theorem, however, the argument 
in toto has a loophole; a person may be unwilling to have a Dutch Book 
made against him and yet willing to forego the automatic protection 
against this contingency which laying bets in conformity with the laws of 
probability provides. 


Reasons can be given then, for holding both weak and strong RP to be false. 
The irrationality of a sure loss just does not lead clearly to any restriction on 
BQs alone. 


5 There is, I believe, a basic misunderstanding which largely accounts for the 
general acceptance of the DBA. The misunderstanding arises from the well- 
established usage in which degree of confirmation is said to give the fair betting 
quotient. 

By fair betting quotient is meant the BQ for which the bet is fair to both 
partners. It should be realised that this must be made consistent with, e.g. 
Carnap’s assertion that such a fair bet (a bet with a ‘fair-to-both’ BQ), can be 
‘subjectively unfavourable to both partners’.? Let us agree that degrees of 
confirmation are fair-to-both BQs. For many reasons, size of stake, size of 
fortune, it must be allowed that a bet with a fair-to-both BQ is not always a 
reasonable bet to make. The property needed for RP is th@very strong one that 


1 Lakatos [1968], p. 360. 2 Shimony [1967]. 3 Carnap [1950], p. 274. 

degrees of confirmation are, not just fair-to-both BQs, but that they are
sufficient conditions for the bet’s being acceptable and reversible. This goes far 
beyond the commonly held view that they are ‘fair BQs’. Acceptance of c-values 
as fair BQs (in the quite usual sense), does not support bet reversal as used in 
the DBA. 

Kemeny chooses oddly when he replaces previous writers’ ‘coherence’ with 
‘fairness’. He claims that ‘fair’ is a useful word because it is suggestive. It is 
indeed suggestive, in a misleading way. ‘A fair bet’ is an expression best avoided 
in the present context. A bet’s having a fair-to-both BQ has no entailment 
about acceptability and reversibility of the bet, for the BQ is only part of the 
description of a bet. The BQ may be fair, yet the bet unreasonable for one side 
or the other. 

A closely similar situation exists with respect to subjectivist acceptance of the 
DBA. A man’s degree of belief, it is asserted, is measured not just by some 
acceptable odds, but by the minimum odds he will take. The same sort of 
misunderstanding is operating here. A man may accept bets with BQs far below 
his degree of belief, the ‘fair’ BQ, because the risk is very small, or because he 
is a man who loves to gamble. Or again, the lowest odds he will take may be 
not as low as those which match his degree of belief, because he takes into 
account the size of the stake. In a word, because of considerations of stake, 
fortune, temperament, a man’s betting behaviour does not uniquely mark out 
his degree of belief. Since acceptance of a bet depends not only on the stake, but 
on a whole complex of beliefs, attitudes, and emotions, the whole betting 
behaviour measurement of belief faces formidable difficulties. 

Reversibility is also assumed to be part of the meaning of a measure of belief. 
Degree of belief is measured, it is asserted, by those odds where the bet is just 
‘balanced’, where the risk of either partner’s losing is evened out. Hence, if one 
side is acceptable to a man, so must the other be. But to define belief in this way 
is to build into it the negation axiom of probability. Unless we wish to assume 
what we are attempting to prove, we have to allow that for a rational man there 
may not be any odds for which he would take either side indifferently. The 
betting situation is being used to prove something about degrees of belief, 
and we cannot assume that either actual behaviour, or ideal rational behaviour, 
follows an analogue of the probability axioms. The DBA purports to rest on one 
sole rationality requirement—the avoidance of a sure loss. To assume that there 
are, or ought to be, rationally reversible BQs, is to build in another assumption. 

I see no possibility of supporting anything as strong as RP, strong or weak. 
In deciding the acceptability of a bet, the rational thing to do is to weigh all 
the factors, stake, BQ, one’s fortune, and so on. In deciding the rationality of a 
set of bets, an eye should naturally also be kept open against the possibility of a 
Dutch Book. But to be prepared to bet solely because of the BQ, is an irrational 
behaviour pattern, whether the BQ is equal to the degree of confirmation, of 
belief, or whatever. There seems to be no other way of supporting a premise as 
strong as Strong RP, and so there appears to be no possibility of founding an 
argument for degrees of confirmation, or degrees of belief, as rational BQs, on 
Strong RP. 

PATRICIA BAILLIE
University of Auckland


NAGEL’S TRANSLATION OF TELEOLOGICAL STATEMENTS:
A CRITIQUE 


In recent years there have been many attempts to explicate the ‘meaning’ of 
teleological statements in biology. Some of the analyses were undertaken in order 
to prove that the use of these statements is ‘congruent with the spirit of modern 
science’. For it was once feared that the expressions which are said to charac- 
terise teleological statements—‘such typical locutions as “the function of”, “the 
purpose of”, “for the sake of”, “in order that”, and the like...’ (Nagel [1961],
p. 403)—had undesirable connotations. It is now generally agreed that these
statements are thoroughly scientific and that ‘teleological... statements in 
biology . . . neither assert nor presuppose . . . either manifest or latent purposes, 
aims, objectives, or goals’ (p. 402). But even though their anthropomorphic 
innocence is widely acknowledged, the use of teleological statements in biology 
remains controversial. Disputes now center on their role in the reduction of 
biology to the physico-chemical sciences. 

In Chapter 12 of The Structure of Science, Ernest Nagel argues that the use in 
biology of expressions ‘signifying a means-end nexus’ does not indicate the pre- 
sence, in that science, of a distinctive type of explanation. For, he claims, state- 
ments which contain these expressions (teleological statements) can always be 
reformulated ‘without loss of asserted content’ into statements which do not 
contain them (nonteleological statements). He says, for example, that the teleo- 
logical statement ‘The function of chlorophyll in plants is to enable plants to
perform photosynthesis’: 


appears to assert nothing that is not asserted by “Plants perform photo- 
synthesis only if they contain chlorophyll”, or alternatively by “A necessary 
condition for the occurrence of photosynthesis in plants is the presence of 
chlorophyll”. These latter statements do not explicitly ascribe a function to
chlorophyll and in that sense are therefore not teleological formulations 


(p. 405).
Nagel contends that the ‘asserted content’ of these teleological and nonteleo- 
logical statements is equivalent: ‘the initial unexpanded statement about 
chlorophyll appears to assert nothing that is not asserted by (the nonteleological
formulations)’ (p. 405). He maintains that the content of the former is ‘fully
conveyed’ by the latter, and that the difference between the formulations is 
merely ‘one of selective attention, rather than of asserted content’ (p. 405). 

This example, Nagel says, may be taken as a ‘paradigm’; he then presents his 
general translation schema: 


when a function is ascribed to a constituent element in an organism, the 
content of the teleological statement is fully conveyed by another statement 
that is not explicitly teleological and that simply asserts a necessary (or 
possibly a necessary and sufficient) condition for the occurrence of a certain 
trait or activity of the organism (p. 405). 


Here Nagel claims that a teleological statement of the form (T1) ‘The function of
X is Y’ can always be replaced by an equivalent nonteleological statement of the 
form (T2) ‘Y only if X’ (or, alternatively, ‘X is necessary for Y’). 

The suggestion is, then, that the equivalence of the teleological and nonteleological
statements is not an empirical matter. For, according to Nagel, the equivalence
of T1 and T2 can be seen by an examination of the statements themselves—
by an analysis of their ‘meaning’ or ‘asserted content’. This means, presumably, 
that once we have the evidence to assert the teleological statement, T1, no 
additional evidence is required to assert the nonteleological statement, T2. Of 
course, the whole point of Nagel’s argument requires that the equivalence be 
independent of empirical evidence, since his express purpose is to show that 
teleological statements are always replaceable by equivalent nonteleological ones. 
If the equivalence were dependent on a case by case examination of empirical 
data, then the strongest claim that Nagel could make would be that sometimes 
teleological statements of the form T1 are replaceable by nonteleological statements
of the form T2—but that only empirical investigation can tell us when the
replacement can be made. A claim of this sort, however, would threaten to 
reintroduce the spectre of that special sort of teleological explanation that it is 
Nagel’s intention to eliminate. 

Two sorts of questions must be asked about Nagel’s analysis. First, is the 
general translation schema correct? Are T1 and T2 always equivalent in asserted
content, or are there counter-examples? Second, is it true that the equivalence of 
the formulations is independent of empirical considerations? Does the equiva- 
lence depend on ‘asserted content’ alone? 

Consider Nagel’s translation schema: he contends that T1, ‘The function of
X is Y’ and T2, ‘Y only if X’, are equivalent in meaning. The only difference 
between them, he says, is ‘one of selective attention’ (p. 405). However, there does 
seem to be an important difference between these statements which amounts to 
more than ‘selective attention’. T1 leaves open the possibility that something
other than X could produce Y; T2, however, explicitly precludes that possibility. 
Nagel and others are aware of this difference, but Nagel claims that it is not 
significant (and therefore that it does not affect his equivalence claim) and the 
others seem to have accepted his view. Nagel acknowledges that the teleological 


1 Morton Beckner, in The Biological Way of Thought, notes that the translation (T2) of the
teleological statement (T1) is not identical with T1 in asserted content: ‘T2 says much
more than T1. For even if it is true that chlorophyll is necessary for photosynthesis, T1

statement ‘The function of chlorophyll in plants is to enable plants to perform
photosynthesis’ does not preclude the possibility of an alternative mechanism, 
while the allegedly equivalent nonteleological statement ‘A necessary condition 
for the occurrence of photosynthesis in plants is the presence of chlorophyll’ 
does preclude an alternative mechanism. This difference, he argues, is irrelevant 
for the following reason: even though the existence of a green plant that performs 
photosynthesis without chlorophyll is logically possible, such a plant does not, in 
fact, exist. Merely logical possibilities, then, are irrelevant: 


teleological analyses in biology ...are not explorations of merely logical 
possibilities, but deal with the actual functions of definite components in 
concretely given living systems (p. 404). 


In other words, teleological statements of function are equivalent to nonteleological
statements of necessary conditions if, by necessary, we mean factually
rather than logically necessary. 

Nagel’s argument can be rephrased in the following way: it is true that teleological
statements of the form T1 leave open the possibility of alternative
mechanisms, while nonteleological statements of the form T2 do not. But the 
statements are still equivalent. For although it is logically possible that some 
factor Z could produce Y just as X does, nevertheless, as a matter of actual fact, 
only X does so. Yet this clearly means that the evidence for Nagel’s equivalence 
claim is empirical: his contention that T1 ‘The function of X is Y’ and T2 ‘Y
only if X’ are equivalent rests on the empirical claim that there exist no cases in 
which some Z (different from X) produces Y. And this, as we shall now see, is 
false. 

Consider the following example. It is true to say that ‘Bone marrow functions 
to produce blood cells’, but it is false to assert either that ‘Bone marrow is neces- 
sary for the production of blood cells’ or that ‘Blood cells are produced only if 
bone marrow is present’. For if the bone marrow is destroyed or diseased, the 
spleen and liver take over the function of producing blood cells in the body.1
Here we have a case in which the supposedly correlative statements ‘The function
of X is Y’ and ‘Y only if X’ are clearly not equivalent. X is not a necessary
condition for Y because, as a matter of actual fact, there is an alternative, Z. In
this instance, then, the nonteleological statement cannot replace (and does not 


does not state that it is true.’ (Beckner [1968], pp. 129-30.) But Beckner completely 
misses the significance of this point, for he then goes on to say: “The elimination (of 
teleological statements) might be carried out by pencil-and-paper operations, in which 
appeal is made only to a translation schema, such as the one suggested by Nagel. For 
this one needs only knowledge of syntax and semantics.’ (Beckner [1968], p. 131.) Here 
Beckner clearly implies that Nagel’s translation schema relies solely on syntax and semantics.
Yet if T2 differs from T1 in the way that Beckner says it does, one needs to have
additional evidence in order to replace T1 by T2. Nagel’s schema transforms a function
statement into a statement of necessary conditions. The latter is stronger than the former, 
and requires stronger evidence. Nagel’s schema merely purports to rely on syntax and 
semantics alone; but in fact, to effect the translation, empirical data are needed. 

1 It is true that in the normally functioning body, bone marrow alone functions to produce 
blood cells. For the spleen and liver to assume this role, the situation must be extra- 
ordinary. Nonetheless, the statement ‘Bone marrow functions to produce blood cells’ is 
true under all circumstances (normal or abnormal), whereas the statement ‘Bone marrow
is necessary for the production of blood cells’ is not.

translate) the teleological one, because the former is false while the latter is true.
Thus the translation schema is inadequate. 
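
The logical shape of this counter-example can be set out in a short sketch. The case records below are hypothetical and schematic, not clinical data. The check for T1 is a deliberately weak stand-in, assumed only for illustration (bone marrow produces blood cells in at least one recorded case); the check for T2 is the ordinary reading of 'Y only if X'.

```python
# Hypothetical, schematic cases: whether bone marrow is present and working,
# and which organ (if any) is producing blood cells.
cases = [
    {"marrow_present": True, "producer": "bone marrow"},
    {"marrow_present": True, "producer": "bone marrow"},
    {"marrow_present": False, "producer": "spleen and liver"},
]


def t1_supported(records):
    """Weak stand-in for T1 (an assumption): marrow produces blood cells
    in at least one recorded case."""
    return any(r["producer"] == "bone marrow" for r in records)


def t2_holds(records):
    """T2, 'Y only if X': every case in which blood cells are produced
    is a case in which bone marrow is present."""
    return all(r["marrow_present"] for r in records if r["producer"] is not None)


print(t1_supported(cases))  # the function statement survives
print(t2_holds(cases))      # the necessary-condition statement does not
```

A single case in which Y occurs without X is enough to falsify T2, while leaving the evidence for T1 untouched.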

We turn now to our second question: is it true that the ‘equivalence’ of the
formulations follows from their ‘asserted content’ alone and that empirical 
evidence is irrelevant? To the contrary, empirical data are crucial. For Nagel 
defends his claim that the formulations are equivalent not by analysing the mean- 
ings (‘asserted content’) but by citing matters of fact. Let us look closely at Nagel’s 
argument. 

He claims that the teleological statement ‘The function of chlorophyll in
plants is to enable plants to perform photosynthesis’ is equivalent to the non- 
teleological statement ‘Plants perform photosynthesis only if they contain 
chlorophyll’—despite the fact that the latter precludes the possibility of an alter- 
native mechanism, while the former does not. How does he defend his equiva- 
lence claim? Not, as we might expect, by examining the meanings of the state- 
ments, but by citing empirical data: 


It is certainly logically possible...that processes in living organisms 
produce starch without requiring chlorophyll...On the other hand, the 
above teleological explanation of the occurrence of chlorophyll in plants is 
presumably concerned with living organisms having determinate forms of
organization and definite modes of behavior—in short, with the so-called 
“green plants.” Accordingly, although living organisms... capable of 
maintaining themselves without processes involving the operation of 
chlorophyll are both abstractly and physically possible, there appears to be 
no evidence whatever that in view of the limited capacities green plants 
possess as a consequence of their actual mode of organization, these 
organisms can live without chlorophyll (p. 404). 


Here Nagel himself admits that the adequacy of the translation, the truth of the 
equivalence claim, rests entirely on matters of actual fact. Only empirical 
evidence can show whether or not the formulations are actually equivalent. Thus, 
whereas Nagel purports, on the one hand, to establish a purely logical point— 
that teleological and nonteleological statements are equivalent because of their 
‘asserted content’—he is nonetheless forced to admit, on the other hand, that 
this ‘equivalence’ can be established only in the light of empirical investigation. 

That Nagel’s translation schema is inadequate, that ‘The function of X is Y’
and ‘Y only if X’ are not equivalent, can be shown in yet another way. Nagel 
advances the following argument to ‘reinforce’ his claim that the formulations 
are equivalent: 


if a teleological explanation had an asserted content different from the 
content of ...(a) nonteleological statement, it would be possible to cite 
procedures and evidence employed for establishing the former that differ 
from the procedures and evidence for warranting the latter (p. 405). 


But, he claims, there are no such differences—or at least no important ones. 
However, consider again the bone marrow example. To establish (a) ‘Bone
marrow functions to produce blood cells’ (teleological statement), one needs only
to trace the production of the cells in many individuals. But to establish (b)
‘Blood cells are produced only if bone marrow is present’ (nonteleological
statement), one needs to determine not only that bone marrow produces blood
cells (the evidence for (a)), but also that no other organ or structure assumes (or is
capable of assuming) this role in the absence of properly functioning bone marrow.
According to this ‘evidential’ criterion, then, the formulations are not
equivalent. 

Indeed, if differences in evidence and procedure for justifying two statements 
are a sound indication of differences in their ‘asserted content’, it is doubtful 
whether Nagel’s own chlorophyll-photosynthesis example holds up. Consider 
the correlative formulations: 


A. The function of chlorophyll in plants is to enable plants to perform 
photosynthesis. 


B. A necessary condition for the occurrence of photosynthesis in plants is the 
presence of chlorophyll. 


Despite Nagel’s claim that they are equivalent, it is apparent that much stronger 
evidence is required to warrant B than to warrant A. Even if chlorophyll is, in 
fact, a necessary condition for the occurrence of photosynthesis, we do not need 
to know this in order to assert that chlorophyll has that function (teleological 
statement A). We obviously do need to know it before we can assert the
nonteleological statement B. In addition, different kinds of evidence are required in
order to falsify A and B. Evidence which would disconfirm the nonteleological 
statement would not, at the same time, disconfirm the teleological one. For 
example, we would say that B is false if we could find some way of removing the 
chlorophyll and replacing it with another substance which also led to photo- 
synthesis in the plant. This would not mean, however, that A is false: it would 
still be true that the function of chlorophyll in plants is to enable the plant to 
perform photosynthesis—since this does not preclude the possibility that some 
other substance might also have that function. 

In view of the misfortunes of the verifiability criterion of meaning and its 
descendents, we are not entirely confident that differences in procedure and 
evidence for justifying (or falsifying) two statements are a sound indication of 
differences in their ‘asserted content’. If they are however (and Nagel believes 
that they are), then the differences in evidence cited above—for Nagel’s own 
photosynthesis case as well as our own bone marrow example—lead one to con- 
clude that Nagel’s claim for the equivalence of the teleological and nonteleo- 
logical statements is as unwarranted by the present argument as it was by his 
earlier ones. 


VIVIEN B. SHELANSKI 


History and Philosophy of 
Science Programme 
National Science Foundation


THE HIDDEN ASSUMPTION IN MACKAY’S LOGICAL PARADOX
CONCERNING FREE WILL 


In a series of recent publications, MacKay has proposed what can only be called 
a paradox, concerning the notion of free will. (Cf. MacKay [1960], [1965], [1966], 
[1967], [1969], [1971].) MacKay purports to show that it is in some sense
impossible to deny the validity of the belief in a free will (at least for the actor 
in question). 

At the outset we must distinguish two forms of the question of free will.
One is the traditional version, which MacKay’s arguments do not establish;
the other is a weaker version, which MacKay seeks to prove.

Historically the claim for a free will was the claim that one’s actions or 
decisions were outside the scope of ordinary physical determinism. That is,
arguing for the validity of free will was seeking to establish that one’s actions or 
decisions were not the result of causes but rather that they arose, to some extent, 
de vacuo. This is what is usually meant by the notion of free will and MacKay’s 
proposals do not establish this sense of the term. 

What MacKay purports to establish is a weaker version of the concept in 
which one’s actions or decisions can be completely causally determined in the 
strong sense for external observers but logically undetermined for the actor in 
question. MacKay conceives this situation as one of philosophical relativism in 
that one’s actions can be both determined (from another’s point of view) and 
undetermined from one’s own point of view. This then is the crux of the issue; 
in order to examine MacKay’s thesis we must examine the question of how it can 
be possible for an action to be both determined and undetermined simultaneously 
depending upon the point of view from which the situation is discussed. We 
must also examine what it means to say that something is logically undetermined
and ascertain how this differs from ordinary physical indeterminism.

First let us consider the logically indeterminate situation of an actor (A). 
Assume that we wish to predict the future states of A’s brain. Any complete 
prediction must consider those states of A’s brain dealing with what A believes. 
The question is whether or not A believes our prediction. If A believes it then 
the prediction could not be correct both before and after his believing it, for if 
correct before then it describes A as not believing it and if correct after A comes 
to believe it then it could not have been correct before. That is, our prediction 
becomes a variable affecting A’s brain states thus making our prediction forever
recursively incomplete. Furthermore, MacKay has argued rather cogently that 
the gambit of withholding our prediction from A will not allow us to escape from 
our dilemma. For if our prediction were a complete and binding one it would 
merit universal assent, logically, by all participants including A. However, A 
would be manifestly in error were he to believe our prediction for by that very 
act our prediction would be falsified. Thus there exists no determinate specifica- 
tion of A’s future beliefs which is binding on everyone, including A, before he 
decides. It is in this sense that MacKay would say that A’s decision is logically 
indeterminate.
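
The first step of this argument, that no prediction can be correct both before and after A comes to believe it, can be checked mechanically. The sketch below is an illustrative simplification, not MacKay's own formulation: it reduces the prediction to its single clause about whether A believes the prediction, and the function names are hypothetical.

```python
def is_correct(prediction_says_a_believes, a_believes):
    """A prediction is correct on this point if what it says about A's
    belief matches A's actual state."""
    return prediction_says_a_believes == a_believes


def correct_before_and_after(prediction_says_a_believes):
    """Check the prediction against both moments: before A believes it
    (a_believes=False) and after A has come to believe it (a_believes=True)."""
    before = is_correct(prediction_says_a_believes, a_believes=False)
    after = is_correct(prediction_says_a_believes, a_believes=True)
    return before and after


# Try both possible contents of the clause about A's belief.
stable_predictions = [
    says_believes
    for says_believes in (True, False)
    if correct_before_and_after(says_believes)
]
print(stable_predictions)  # no candidate survives both checks
```

Whichever way the clause is written, A's coming to believe the prediction changes the very state the clause describes, so neither candidate is correct at both moments.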

Contrast this situation with the completely determinate one, a complete
prediction of which would merit universal assent by all observers. MacKay 
would say, for example, that the motion of a planet is in principle completely
predictable in terms that are logically binding on everyone. This is the ordinary 
situation of a complete physical determinism. It thus contrasts sharply with the 
situation obtaining in the case of A where A’s future brain states can be com- 
pletely determinate, in the same sense as the motions of the planet, for an 
independent observer, and logically indeterminate for A himself. 

We now come back to the questions initially posed: what is the nature of a 
logically indeterminate event and is a logically indeterminate event the same 
type of thing as a physically indeterminate event? In order to answer these 
questions we must determine how our two examples differ. What, precisely, is 
happening with A that differs from what is occurring in the case of the planet’s 
motion? 

The distinction between the two situations is that A, on the one hand, is a 
cognitive system while the planet is, on the other hand, a non-cognitive system. 
That is, our paradox obtains because A is taken to be the kind of a thing that
can apprehend predictions while planets cannot. Stated in another way: In the 
case of A we are formulating statements (predictions) in one system (a cognitive 
system) about future events (beliefs) which are also part of that system. However, 
in the case of the planet we are formulating statements (predictions) in one 
system (again a cognitive system) about future events (planetary motions) which 
are part of a different system (a non-cognitive one). Thus our paradox obtains 
because we assume two distinct systems only one of which is granted the 
possibility of certain crucial characteristics (i.e. knowing, believing, deciding 
etc.). What I am saying is that the problem is one of information feedback in 
that we deny receipt of the necessary information to the planet. That is, our 
predictions, by their very nature, are members of a system to which the planet is 
denied access. We have formulated a fundamental dichotomy among events in 
our universe. On the one hand we have the ordinary physical processes as 
exemplified by the planet and on the other hand we have those peculiar processes 
which only occur in cognitive systems (believing or dis-believing predictions). 

Now in what sense can such a distinction be made? 

There is an obvious dualistic sense in which cognitive events (logically 
indeterminate events) are held to differ, in kind, from ordinary physical processes 
(physically determinate events) such as planetary motions. That is, by a dualistic 
gambit we can maintain that cognitive events are of a different sort of ontological 
class than ordinary physical events. In fact it is in just this sense that the dis- 
tinction must be made in order for the argument to proceed; for if we did not 
make this tacit assumption observe what would happen to our logical indetermi- 
nacy. If we pursue a monistic ontology of physical events then such things as 
cognitive events would themselves be physical processes (this is of course 
MacKay’s initial hypothesis). Thus to say that a cognitive event (such as a 
decision) is physically determinate and logically indeterminate, would be to say 
that a cognitive event (such as a decision) is both physically determinate and 
physically indeterminate. This is of course a contradiction! Thus in order for a 
physically determinate event to be logically indeterminate, cognitive events 
must be considered as unique ontological entities different from physical 
processes. This tacit dualism is at the heart of MacKay’s enterprise and is the
price at which it is allowed to succeed.

It should be noted that if one wishes to posit a dualism of cognitive events
and physical events then one has no need of MacKay’s paradox for one could 
simply assert that cognitive events, by virtue of their fundamentally different 
nature, lie outside the scope of physical determinism. However a much more 
serious objection can be levelled at MacKay’s thesis at this juncture. What 
MacKay’s entire programme is designed to do is to show that even were we to 
grant the mechanistic determinist all his assumptions we could still assure 
ourselves that our decisions are logically free. Thus MacKay’s initial hypothesis 
is that the universe forms a rigidly determinate interlocking system with A’s 
brain, and that for each cognitive event (beliefs, decisions, etc.) there is an event 
in A’s brain. In short MacKay’s hypothesis is that A’s brain is part of a determi- 
nate physical system precisely as the planet is part of a determinate physical 
system. MacKay then wants to say that in spite of this hypothesis A is still free. 
However, we have seen that the argument proceeds by virtue of a tacit dualistic 
assumption which stands in obvious contradiction to the stated foundational 
hypothesis—thus we see that MacKay’s thesis involves a hidden contradiction 
which vitiates the entire programme. 

Finally, I feel compelled to say a few words by way of ascertaining whether it 
is possible to provide a non-contradictory explication of MacKay’s notion of 
logical indeterminacy. It would seem that we could explicate the notion by 
appeal to a distinction similar to the distinction between the analytic and the 
synthetic. That is, we might say that the notion has cogency (as outlined in 
MacKay’s arguments) in that it tells us something about the logic of such con- 
cepts as choosing, believing, deciding, etc. Thus to say that A’s action is logically 
indeterminate would simply be to tell us something about the logical structure of 
our linguistic system. Of course this sense of the notion tells us nothing about 
the problem of free will—whatever else the free will problem is, it is not a 
question about the logic of the linguistic system. As noted above the only way in 
which the notion could say anything about the problem of free will is in a way 
which embroils us in a contradiction; that is, if logical indeterminacy is synthetic 
then it cannot obtain simultaneously with complete physical determinacy, and 
if it obtains simultaneously with complete physical determinacy then it can only 
do so by appeal to a tacit dualism, but if we appeal to a dualism then this stands 
in contradiction to the argument’s initial hypothesis. Thus the concept is either 
synthetic and contradictory or analytic and trivial. 

Thus I think that I have shown that there are only two possible meanings of 
MacKay’s notion of a logical indeterminism. One is trivial and does not in any 
traditional sense establish the validity of the concept of free will. The other, and 
this is what I suspect MacKay’s position in fact is, is a sense of the notion which 
hinges upon a hidden dualistic assumption. It is only by virtue of this implicit 
dualism that MacKay’s argument has any force at all. Far from showing, as he 
purports, that granting the mechanist his assumptions will lead to the affirmation 
of the validity of the notion of free will, rather MacKay has shown that granting 
those assumptions only a tacit dualistic gambit can save free will from destruc- 
tion. 


LARRY W. DEWITT
Northern Arizona University 
THE LOGICAL INDETERMINATENESS OF HUMAN CHOICES


In his [1972] McDermott berates Watkins ([1971]) for ‘completely misunder- 
standing MacKay’s intent’, and then, alas, proceeds to do the like at several 
points.

(1) ‘MacKay’s claim’, he says, is ‘that human actions are predictable and man
is free’. In fact, I expressly disowned (e.g. MacKay [1967], p. 36, henceforth
FAMU) the claim that all human actions are predictable. My point was that 
even if they were, there is at least one sense in which human choices could still 
be ‘free’: There would not exist (even unknown to anyone) a determinate complete 
specification of the future of a cognitive agent which had a well-founded un- 
conditional claim to the assent of anyone and everyone if only they knew it. 

(2) The above (see FAMU, pp. 12-13) was aimed precisely at what McDermott 
calls ‘apredictive determinism’: the claim that ‘future events are, even before 
they happen, unalterably fixed’. This claim does not necessarily imply that human 
actions are predictable; but it surely does imply that there exists (even if un- 
knowable by anyone) an unalterably fixed specification of the future which is 
already true, in the sense that anyone would be correct to believe it, and in error 
to disbelieve it, if only he knew it. It is this ontological claim that my argument 
(I think) refutes. I have of course dealt also with the case of imaginary predictors; 
but the force of the argument is not limited to predictive situations, nor is it 
making only an epistemological point. McDermott’s title indeed highlights his 
misunderstanding. It is not because ‘I don’t yet know’ what I am going to do, 
but because what I am going to do is not as yet ‘unalterably fixed’ in the foregoing
sense, that I am free. 

(3) I was careful not to claim that the observer’s and agent’s descriptions are 
both ‘true’. I have long argued (e.g. MacKay [1961]) that ‘true’ should be kept
for descriptions that anyone and everyone would be correct to believe and in 
error to disbelieve. What I claim is that in these peculiar situations it is the 
validity of beliefs (putative or actual) rather than the truth of propositions, that 
can be objectively and unequivocally assessed (MacKay [1960], p. 39; FAMU,
pp. 26-27).
(4) The idea that ‘indeterminate-for-A implies determinable-by-A’ (p. 349) is
manifest nonsense, and only a misreading of the passage he quotes on p. 345
can excuse McDermott’s attributing it to me. What the Laplacean Intelligence
(L.I.) claimed to have (MacKay [1971], p. 283) was ‘evidence that
A-as-a-cognitive-agent faced an outcome indeterminate-for-him which it was in his
power to determine’. The claim italicised is meant in its ordinary everyday sense.
It is not a deduction from the indeterminacy of the outcome, as McDermott 
takes it to be, but depends on the (mechanistic) evidence available to L.I. (see 
FAMU, pp. 20-22); and it is added as an essential precondition of the dis- 
cussion’s making any sense at all. The question at issue has been: If, in those 
situations where it is agreed that in the usual sense A has the power to do X, it 
is nevertheless also agreed that his doing X is part of a physically determinate 
chainmesh of events—can we still regard A as ‘free’ in any meaningful sense? 
The answer traditionally offered is ‘No—any accompanying feelings of freedom 
A has must be illusory, and A is simply deluded if he believes that the outcome 
is not already fixed’. To expose the fallacy underlying this answer, I have asked:
what is the ‘truth’ about the outcome that would, if only A knew it, remedy the 
condition you describe as ‘deluded’? The conclusion of my argument is that no 
such ‘truth’ exists. Instead, A is correct to regard his future action as unde- 
termined, not indeed in the sense of ‘uncaused’, but in the sense summarised in 
(2) above. The traditional answer is thus mistaken. Scientific determinism does 
not prove A’s feeling of freedom to be illusory. On the contrary, careful thought 
merely confirms A’s intuition that in the cases in question the outcome is still 
up to him, and that no specification of it exists, unknown to him, which he would 
be correct to accept as inevitable if only he knew it. 

Larry DeWitt’s paper (DeWitt [1973]) raises two kinds of question. The 
first concerns the relation between logical and physical determinateness/inde- 
terminateness, and is I think partly answered above, as well as in my [1971] and 
earlier papers. Briefly, I see no contradiction, apparent or otherwise, in saying 
that event X has a determinate and unique specification (in the sense in which 
an equation can have a determinate and unique solution) with a well-founded 
unconditional claim to O’s assent but not to A’s. This, as I have often explained, 
is what I mean by calling X logically indeterminate for A but not for O. If anyone 
wants to use the term non-relativistically, then I think he must describe X as 
logically indeterminate tout court, in the sense that it has no determinate and 
unique prior specification with a well-founded unconditional claim to the assent 
of anyone and everyone (including A). 

To call an event physically or scientifically determinate, on the other hand, 
means (in my usage) that it fits tightly into a causal chainmesh, sufficient to 
account completely for every detail of it, at least in retrospect. There is nothing 
relativistic about this. A is not precluded from believing that the outcome of his 
choice is physically determined (i.e. will turn out in due course to have had 
adequate causal antecedents in the prior configuration of the physical universe). 
He is only precluded from inferring from this belief that the outcome is already 
‘fixed’ beforehand in the sense explicated in (2) above. 

This brings me to DeWitt’s second question: what ‘hidden assumption’ must 
one make to sustain my argument? The answer is given explicitly in my [1967]¹:


We have claimed to be working out some consequences of the most
crassly mechanistic view of the brain, as a clockwork mechanism within a


1 MacKay [1967], pp. 14-15. 
clockwork universe. How on earth have we moved from such a starting point
to talking of people as irreducibly indeterminate and mysterious to one 
another? Something, surely, has been surreptitiously imported along the 
way? 

And of course something crucial has been imported, though not sur- 
reptitiously. This is the assumption... made by the mechanist himself, 
that what a man believes and thinks is rigorously reflected in the state of 
his brain. It is this very assumption—one that to many anti-mechanists has 
appeared as a deadly challenge—which generates these peculiar logical 
consequences. It amounts to postulating that certain parts of the domain of
physical science—namely, the brains of cognitive subjects—have the task 
of representing that domain of discourse itself. Since sentences that talk about 
themselves are notoriously treacherous in logic, we need not be surprised 
that we have to accept some indeterminacy—some lack of complete detail— 
in any description of brain machinery that must be self-consistently re- 
presentable by that same brain machinery. 

No, the root of mystery in our analysis is the brute fact of cognition itself. 
Deny that the individuals under discussion are cognitive subjects, and the 
question of what they would be correct to believe disappears. The whole 
physical domain of discourse would thus become fully specifiable in theory 
—but then, of course, we would no longer be talking about human beings 
like ourselves, and about what our brain science allows us to believe, which 
was the original purpose of our discussion. In a sense, then, all we have been 
doing here is to spell out some implications of the mysterious but empirically 
undeniable correlation between our conscious experience and what goes on 
inside our heads. 


Does this entail ‘a dualism of cognitive and physical events’? Certainly not in 
DeWitt’s sense; for he states that this would allow one to ‘assert that cognitive 
events... lie outside the scope of physical determinism’; whereas the key 
assumption whose consequences I explore is that cognitive experience is 
rigorously correlated with physically-determinate brain-events. The flaw in 
DeWitt’s argument comes where he asserts that (on monistic assumptions) ‘to 
say that a cognitive event . . . is physically determinate and logically indeterminate 
would be to say that (it) is both physically determinate and physically inde- 
terminate’. Given the meaning of ‘logically indeterminate’ explained above, this 
simply does not follow. There is no need to assume that the events of cognitive 
experience and physical events are two sets of events, in order to show the logical 
indeterminacy of some future events of cognitive experience for the cognitive 
agent.

To put it positively, my argument starts (openly) from the brute fact that we 
have conscious experience, and that we sometimes find this to be correlated with 
our brain activity. It asks what would follow if all our conscious experience were 
rigorously so correlated, and if all our brain activity were physically determinate 
in the above sense. It shows without further assumption that even so, no complete 
determinate specification of our future could exist that we would be
unconditionally correct to accept as inevitable if only we knew it: in other words, our
future would still be logically indeterminate for us and those in dialogue with us.
There may well be certain brands of reductionist monism which are hard to 
square with this conclusion—indeed I think there are; but if there is no flaw in
the argument itself it is the reductionist who must take the knock. My own view, 
for what it is worth, is that both classical dualism and reductionist monism are 
unsatisfactory attempts to conserve hierarchically complementary truths about 
our human nature. If I may quote from an earlier paper:


The reductionist recognizes the autonomy of explanatory behavioural 
principles at the physical level. The dualist recognizes that the reality of 
what it is to be an agent is richer—has more to it—than can be described 
either in mind talk or in brain talk alone. Our suggestion is that it is a 
mistake to regard these emphases as contradictory. It is not extra-physical 
forces that we must admit in justice to the facts that dualism wants to express, 
but additional logical dimensions, further aspects of the situation, requiring 
personal categories which are still necessary when we have exhaustively said 
all we can at the physical level. 


No step in my main argument, however, hangs (covertly or otherwise) on the 
acceptance of this intermediate position, or on the rejection of any ‘monism’ that 
is compatible with the brute facts I have referred to. 


D. M. MACKAY 
University of Keele 
Suppes, P. [1970]: A Probabilistic Theory of Causality. Acta Philosophica
Fennica, Fasc. XXIV. Amsterdam: North-Holland Publishing Company. 
1970. Df. 27.00. Pp. 130. 


This is the first book-length philosophical monograph by one of the most 
brilliant, versatile and prolific scholars of our time. It consists in an exact modern 
version of Hume’s mistaken view of causation as constant conjunction. Suppes 
expounds this view, with his customary rigor and clarity, alas with considerable 
philosophical naiveté, in the first half of his work. The other half is about equally 
divided into a quick and disconnected review of some related philosophical 
issues, such as free will, and a useful appendix on probability. The list of 
references presumably reflects the author’s attitude towards philosophical 
tradition: Suppes ignores at his risk most of the literature on a subject which is 
as old as philosophy. 

A cause, in Suppes’s view, is anything that probabilifies. More precisely, if $B_{t'}$
is an event at time $t'$ and $A_t$ an event at a later time $t > t'$, then $B_{t'}$ is said to be
a prima facie cause of $A_t$ just in case the conditional probability of $A_t$ given $B_{t'}$
is greater than the absolute probability of $A_t$: $P(A_t \mid B_{t'}) > P(A_t)$. A sufficient
or determinate cause is defined as a cause that produces its effect with probability 
one. And a spurious cause is a prima facie cause that has no influence on the 
outcome. 

The whole monograph consists practically in a careful exploitation of these 
definitions with the help of the probability calculus, as well as in the defence of 
this noncausal view of causation. Hence despite the title no theory proper is 
being offered in this volume, which boils down to some definitions and their 
consequences. The only (tacit) hypothesis is what physicists misleadingly call 
the ‘causality condition’, namely the assumption that influences precede their 
effects. (I object to the name, which Suppes accepts, because the principle may 
and often does concern random variables. ‘Antecedence principle’ seems to be 
a more adequate name.) There is no other assumption, not even—as the author 
himself emphasises—the hypothesis that every event has some cause. The book 
expounds then a noncausal nontheory of causation. 

Suppes’s overhauling of Hume’s analysis of causation is as inadequate as the 
latter for the following reasons. Firstly, it consecrates the post hoc ergo propter hoc 
fallacy, well known to the schoolmen who pointed out that the rooster’s first song 
announces dawn without causing it. Consider the events 


$B_{t'}$ = The barometer is falling (here) at time $t'$.

$A_t$ = It rains (here) at time $t > t'$.
Since the probability of rainfall at a certain place, given that the barometer has 
been falling, is greater than the absolute probability of rainfall, Suppes asks us to 
regard the falling of the barometer as a cause of rain. Second, by the same token
the Hume–Suppes analysis does not distinguish between sustained (but perhaps
accidental) positive correlation, and causation. Hence the whole point of speaking
of causes, as distinct from both functional relations and statistical correlations,
is missed. ‘Cause’ is devalued to the point that it pays for anything. 
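
A small numerical sketch may make the barometer objection concrete. The figures below are hypothetical, chosen only for illustration, and the model assumes (as a simplification that is not in Suppes's text) that an approaching storm front is the only thing acting on both the barometer and the rain. No influence runs from barometer to rain in this model, yet the prima facie condition P(rain | barometer falls) > P(rain) is satisfied.

```python
from itertools import product

# Hypothetical probabilities, chosen only for illustration.
P_FRONT = 0.3                             # P(storm front approaches)
P_FALL_GIVEN = {True: 0.9, False: 0.1}    # P(barometer falls | front?)
P_RAIN_GIVEN = {True: 0.8, False: 0.1}    # P(rain | front?)


def joint():
    """Joint distribution over (front, fall, rain); fall and rain depend only on front."""
    table = {}
    for front, fall, rain in product([True, False], repeat=3):
        p = P_FRONT if front else 1 - P_FRONT
        p *= P_FALL_GIVEN[front] if fall else 1 - P_FALL_GIVEN[front]
        p *= P_RAIN_GIVEN[front] if rain else 1 - P_RAIN_GIVEN[front]
        table[(front, fall, rain)] = p
    return table


def prob(table, event):
    """Probability of the set of outcomes for which event(outcome) is true."""
    return sum(p for outcome, p in table.items() if event(outcome))


def cond(table, event, given):
    """Conditional probability P(event | given)."""
    return prob(table, lambda o: event(o) and given(o)) / prob(table, given)


TABLE = joint()
front = lambda o: o[0]
fall = lambda o: o[1]
rain = lambda o: o[2]

p_rain = prob(TABLE, rain)
p_rain_given_fall = cond(TABLE, rain, fall)
p_rain_given_front = cond(TABLE, rain, front)
p_rain_given_fall_and_front = cond(TABLE, rain, lambda o: fall(o) and front(o))

# The prima facie condition as stated in the review: P(A | B) > P(A).
is_prima_facie = p_rain_given_fall > p_rain

print(f"P(rain)                   = {p_rain:.3f}")
print(f"P(rain | falling)         = {p_rain_given_fall:.3f}")
print(f"P(rain | front)           = {p_rain_given_front:.3f}")
print(f"P(rain | falling & front) = {p_rain_given_fall_and_front:.3f}")
print(f"prima facie cause?          {is_prima_facie}")
```

In this toy model the falling barometer raises the probability of rain, but once the front is taken into account the barometer reading adds nothing: the last two conditional probabilities coincide.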

Thirdly, Suppes adopts uncritically the vague notion of an event used in
probability theory, where an event is just an element of the field of sets
constituting the domain of definition of the probability measure. Fourth and consequently,
he cannot distinguish events proper (changes in the states of things) from proper- 
ties and states or even from purely conceptual objects. Since anything that 
probabilifies is taken to be a cause, even prior states count as causes of later 
states. Thus because the probability of weighing 10 lb at birth is favourably 
relevant to weighing 200 lb at age 10, we would have to count the former as a
cause of the latter—rather than investigating, say, the weight regulation 
mechanism of the child. Fifth, for the same reason (i.e. for not having faced the 
problem of characterising the notion of an event), Suppes is led to speaking of 
negative events (such as not catching a cold) and therefore of logical events (such 
as catching a cold or not catching a cold)—which is all right as long as the 
probabilistic construal of ‘event’ is kept but is of no help to the scientist and the 
philosopher, to whom there are only ‘positive’ events and (affirmative or negative) 
propositions about events. 

Sixth, because the ‘events’ considered by Suppes are conceptually possible 
items of any kind, not actual facts such as changes in concrete things, his 
pseudocausal relation holds among possibles not actuals. This, besides being a 
nuisance in itself, has the undesirable consequence that disjunctive ‘events’ 
(such as A ∪ B) may have a greater causal efficacy than their components—even
though no disjunctive events ever occur. In fact there are no disjunctive occur- 
rences such as going for a swim or staying home to read: disjunction is a mark of 
possibility not of actuality. (Parenthetically, Suppes’s axioms of occurrence on 
page 38 fail to elucidate the concept of actual occurrence of an event not only 
because they fail to specify that the events must be real not conceptual, but also 
because they assert, among other things, that if event A occurs then ‘event’ 
A ∪ B occurs.) Seventh, because the Hume–Suppes explication of causation is
independent of the concept of a law of nature, it does not allow one to spot what 
I have called the causal range of a law statement, i.e. the range over which the 
events described by the statement may be said to be causally related. 

Most of the above seven objections to the Humean analysis of causation have 
been discussed in the vast philosophical literature on causality, which Suppes 
has chosen to ignore for the most part. (Being exclusively concerned with the 
literature won’t get you ahead of the librarian, but ignoring it altogether may 
make you either discover the Mediterranean or miss it.) Nevertheless this study 
is worth reading, if only because it reconstructs in exact terms a philosophical 
doctrine that is as important as it is false. 

MARIO BUNGE 
McGill University 
Tondl, L. [1973]: Scientific Procedures (translated from the Czech by D. Short).
Vol. 10 in R. S. Cohen and M. W. Wartofsky, (eds.): Boston Studies in the
Philosophy of Science. Dordrecht: D. Reidel Publishing Company. Pp. xiii +
268.


Volume 10 of the Boston Studies in the Philosophy of Science consists of a
translation from the Czech of a work by Ladislav Tondl. The editors of the series have
perhaps not chosen to present to English readers the most original of Professor
Tondl’s contributions on the foundations of science. Apart from an emphasis on
the informational or communicational conception of scientific activity and on the
so-called finitistic approach to scientific procedures, much of the material covered
will be familiar from textbooks on philosophy of science. The treatment of these
topics, however, is for the most part thorough and competent.

The work opens with a characterisation of science as cognitive activity whose 
aspects are distinguished into methodological and theoretical, institutional, 
sociological and psychological, and ideological. It is with the first of these that the 
work is concerned. Comments on the remaining aspects are brief. (Thus, 
concerning the last (p. 3): ‘The achievements of science cannot but influence
what is traditionally known as Weltanschauung and fundamental ideological 
attitudes to society and to the position of man in the world and society’.) The 
goals of cognitive activity in science are described as explanation, prediction, 
verification, confirmation, constitution (Aufbau), reduction, etc. The methods 
by which these are achieved are called scientific procedures (in the broad sense of 
the title) and regarded as operations with data. 

After a chapter covering proper names, descriptions, comparative and 
quantitative predicates, empirical, dispositional and theoretical predicates, 
similarity and identity, the author turns to the treatment of scientific explanation 
and prediction. The task of giving a scientific explanation is regarded generally 
as the problem of answering a why-question. The author also analyses the structure 
of whether-questions and which-questions, following Giedymin, Harrah and 
Belnap, and gives an application to the problem of crucial experiments. He is 
in error, however, to suggest (p. 140) that provided two hypotheses are incom- 
patible (exclusive), the refutation of one justifies the acceptance of the other. 
The author is not concerned to add yet another typology of scientific explanation 
to the literature, but claims that most typologies have been drawn up in an 
unsystematic way. He points out that types of explanation can be distinguished in 
accordance with the nature of the explanans, the nature of laws appearing in 
the explanandum, the relation between explanans and explanandum, and in 
accordance with pragmatic considerations (degree of satisfactoriness,
completeness, etc.). This section makes an interesting contribution. Accepting the
essential role of laws in scientific explanation, the author is led to a treatment of 
the instrumentalistic viewpoint of laws, which he rejects, and the problem of 
distinguishing scientific laws from so-called accidental generalisations. The 
arguments against instrumentalism are based on a very narrow understanding of 
this position (laws as inferential rules) and are of unequal merit. Thus the author 
asks how is it possible to imagine the ‘conjunction’ of rules with initial conditions, 
as laws are understood to be conjoined to initial conditions in explanation. After
a thorough examination of various attempts of a purely syntactical or semantical 
nature to distinguish laws and accidental generalisations, the author proposes
pragmatic criteria based on explanatory and predictive power, as explicated by 
members of the Finnish School. 

Turning to the question of statistical explanation, the well-known difficulties 
in Hempel’s conception are pointed out and the author proposes, in its place, 
what he calls the ‘decision model’. The remarks on this model seem to be still 
of an exploratory character. It is said to operate with input and output spaces 
and a communication channel between them. To explain a certain element of 
the output space (an experimental finding, e.g. a patient’s symptoms) means
finding a decision function making it possible to assign to each such element an
element of the input space (state of the system, e.g. pathological state) or a
probability distribution over at least part of that space (p. 233). This is un- 
fortunately the only explicit connection stated between the communication 
conception and scientific explanation. (There is therefore perhaps less in common 
between this explication and one recently given by J. G. Greeno, for example, 
than might at first sight appear.) The book concludes with a discussion of the 
structural identity of explanation and prediction and of the justification and 
evaluation of prognostic statements. 
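
One way to picture the ‘decision model’ described above is the following minimal sketch. It is only one possible reading, not Tondl's own formulation: it assumes that the communication channel can be written as a table of conditional probabilities of findings given states, and that a prior over states is available. The state names, the finding names and all numbers are hypothetical. The function `posterior` assigns to an output element a probability distribution over the input space; `decide` assigns a single element of the input space.

```python
# Hypothetical prior over the input space (states of the system).
PRIOR = {"healthy": 0.90, "flu": 0.08, "pneumonia": 0.02}

# Hypothetical channel: P(finding | state) for each state of the input space.
CHANNEL = {
    "healthy": {"no_fever": 0.95, "fever": 0.05},
    "flu": {"no_fever": 0.20, "fever": 0.80},
    "pneumonia": {"no_fever": 0.10, "fever": 0.90},
}


def posterior(finding):
    """Map an output element to a probability distribution over input states."""
    weights = {state: PRIOR[state] * CHANNEL[state][finding] for state in PRIOR}
    total = sum(weights.values())
    return {state: w / total for state, w in weights.items()}


def decide(finding):
    """Map an output element to the single most probable input state."""
    post = posterior(finding)
    return max(post, key=post.get)


for finding in ("fever", "no_fever"):
    print(finding, "->", decide(finding), posterior(finding))
```

Choosing the most probable state is just one candidate decision function; a different criterion (for instance one weighting the cost of a missed diagnosis) would give a different assignment from the same channel.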

The author states in his Preface that the work ‘sets out from the viewpoint 
which might be described as an informational or communicational conception of 
scientific activity’. It is intended that this viewpoint should be reflected through- 
out the work. At an early stage therefore ‘the communication model of cognitive 
activity’ is presented. This consists of three blocks: (A) source, (B) channel, 
(C) output. Block A consists of a set of objects or events, B may be a human 
observer or one furnished with apparatus, C is a system of statements. There is 
understood to be a two-way linking of A, B and B, C. The nature of this linking, 
however, is left ambiguous. The interaction of A and B seems to be of a physical- 
empirical character. Yet the action of B on C would only have this character if 
the ‘statements’ comprising C were understood as assertions or beliefs. But the 
exact character of the reaction of C on B would then remain obscure, as would that 
of C on A—which is also asserted (pp. 36, 88, 89). These deficiencies, however, 
do not mar the greater part of the work, in which no explicit technical use is 
made of the communication model. 

Corresponding to the postulated finite discriminatory capacities of block B and 
to the viewpoint of scientific procedures as operations with data, the author 
proposes the principle of methodological finitism which ‘assumes that the set of 
means used in the solution of a given task is respected as finite’ (p. 115). This 
principle, however, finds little application in the work except as a motivation for 
replacing a, so to speak, non-operational Leibnizian concept of physical identity 
by one of more effective character (pp. 115-19). 

The author suggests in his Preface that there is some conflict between his 
conception which accentuates the procedural approach and employs the methods 
of information theory and what he calls ‘the static conception of logico-syntactic 
analysis as developed by the analytical and positivist schools’. It is not clear 
what the author has in mind by the latter. However, the text itself would suggest 
that this conception and these methods are complementary to, rather than being 
in conflict with, the approach of, say, the present Polish logicians. (The only 
noticeable difference of opinion, which is of little importance here, concerns the
questionable identification of analyticity, L-truth, and the quality of being a
scientific law—see footnote 20, p. 145.) 

The translation reads smoothly though a few individual words have gone
astray, e.g.: ‘impulse’ instead of ‘momentum’ (p. 31), ‘capacity’ instead of
‘volume’ (p. 114), ‘relativist’ instead of ‘relativistic’ (p. 117). Unfortunately the
book is published without an index. 

P. M. WILLIAMS 
The University of Sussex 


Kahl, R. (ed.) [1971]: Selected Writings of Hermann von Helmholtz. Wesleyan
University Press. Pp. xlv + 542.


Helmholtz’s activities covered a wide field: besides his contributions to physics 
and physiology, he wrote on non-Euclidean geometry and its epistemological 
implications, went out of his way to disseminate scientific ideas amongst the 
general public, and became a considerable public figure. This is a collection of 
some of his less technical papers, some of which Mr Kahl has translated himself, 
and others of which he has edited from previous translations. 

A man of so many interests requires a correspondingly large range of abilities 
in a translator. Most of these Mr Kahl has; one however is conspicuously lacking, 
namely an adequate knowledge of the German language. This is best shown by 
examples. 

To be fair, let us look at one of the papers which Mr Kahl has translated 
entirely himself, namely ‘The Facts of Perception’ (‘Die Thatsachen in der 
Wahrnehmung’), which we shall compare with the text of Helmholtz’s Vorträge
und Reden, vol. II. Being an address given at Berlin University in 1878 on the 
anniversary of its foundation, it starts with a comparison of the state of science 
and the nation in 1878 and 1810 (the year of foundation), before going on to 
define Helmholtz’s views on the nature of space as against those of Kant. 

Mistranslations, omissions and misrepresentations of Helmholtz’s intentions 
abound throughout the translation, and often ones which are astonishingly 
elementary or even nonsensical. For instance, Helmholtz says that an intuition 
is popularly regarded as something simply given without effort, and then 
continues: ‘Dieser populären Meinung schliesst sich ein Theil der physiologischen 
Optiker an ...’. This is translated: ‘This popular interpretation . . . is due in part
to certain theorists in physiological optics... .’ Now, the verb ‘sich anschliessen’ 
cannot be translated ‘to be due to’. But even an elementary knowledge of the 
German case system would reveal that something is wrong here. ‘Dieser Meinung’ 
is not in the nominative case, and thus cannot be the subject of the sentence. 
While since ‘Theil’ is a masculine noun, ‘ein Theil’ (translated ‘in part’) must be in 
the nominative, and therefore the subject. The sentence in fact means: ‘A number
(literally, “a part”) of the workers in physiological optics . . . attach themselves
to this popular opinion . . .’, i.e. almost the contrary of the translation given.

When such elementary mistakes occur, one expects that the translator will be 
tripped up by the intricacies of German syntax, as indeed happens in our 
present case—sometimes with ridiculous results. For instance, Helmholtz says 
in his introduction: ‘politische Freiheit giebt zunächst den gemeinen Motiven mehr
Schrankenlosigkeit sich zu zeigen und sich gegenseitig Mut zu machen, so lange ihnen
nicht eine zu energischem Widerspruch gerüstete öffentliche Meinung gegenübersteht.’
This is translated: ‘Political liberty allows the baser motives more freedom to 
display themselves and, provided they are not too energetic, to generate forces 
opposed to prevailing public opinion.’ Apart from irritating minor inaccuracies
(‘Schrankenlosigkeit’ means ‘licence’ rather than ‘freedom’, and ‘zunächst’ is
simply omitted—making Helmholtz sound more antidemocratic than he was), 
the end of this sentence is practically nonsense in English. The meaning is in fact: 
‘Political liberty initially gives the base motives more licence to show themselves 
and encourage one another, as long as they are not faced with a public opinion ready 
to offer energetic opposition.’

Sometimes our translator makes Helmholtz contradict himself. For instance, 
Helmholtz says that the less mentally endowed animals are, the quicker they 
learn to orientate themselves after birth, whereas human children take days to 
learn how to turn their head towards their mother’s breast. He continues by 
saying that young animals are at all events ‘much more independent’ (‘viel 
unabhängiger’) of individual experience. But oddly the translation given is 
‘quite independent’, although Helmholtz has just said precisely that they are not 
quite independent, but only more independent the more primitive they are. 

As Kant is the object of Helmholtz’s attention in this address, one would 
expect some correlation with standard translations of Kant. Yet when Helmholtz 
says that the qualities of our sensations are not to be dismissed as ‘leerer Schein’, 
this phrase is translated ‘empty appearances’, although ‘appearances’ is the 
standard translation of ‘Erscheinungen’, and Kant emphasises that by the latter 
he does not mean ‘Schein’ (e.g. Kritik der reinen Vernunft, 2nd ed., p. 69). 

The result of these errors, and of many others which it serves no purpose to 
list further, is that nobody who wishes to study Helmholtz at all seriously can 
employ this translation of the address concerned, nor by implication the rest of 
the collection. 

But where does the responsibility lie for this disaster? Perhaps not so much 
with the translator himself, who presumably struggled for hours to extract some 
sense from a text beyond his capabilities. The book is published by a University 
Press, which should have the expertise to check the translation against the 
original, although this obviously did not happen. 

One would expect that the faults of such a book will at least be noted when it 
reaches review in reputable journals. Yet in one of its reviews (in the Scientific 
American) it is baldly stated that ‘The translation and editing are mainly new
and generally excellent’, which could not be truthfully said by anyone who had 
seriously examined the accuracy of translation. 

So whom can the English speaking reader trust, if neither translators nor
publishing company nor reviewers can apply anything approaching scholarly
standards? And how much more work of this kind is being offered as translations
to an English speaking public which is so rapidly losing the ability to read
scientific classics in the original German?*

MALCOLM LOWE
The Van Leer Foundation
Jerusalem

* [Editor’s Note]: Incidentally, there exists an alternative English translation of Helmholtz’s
‘Die Thatsachen in der Wahrnehmung’ in R. M. Warren and R. P. Warren: Helmholtz
on Perception, John Wiley, 1968.